Accessing the highest digits of large numbers from Python long

Question 1

A simple approach that does not dig into the low-level implementation of the long type:

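>>> import math
>>> n = 17**987273  # a number with about 1.2 million digits
>>> digits = int(math.log10(n))  # floor(log10(n)), one less than the digit count
>>> k = digits - 24  # keeps the first 25 digits
>>> n / (10 ** k)  # integer division in Python 2
9953043281569299242668853L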

This runs quite fast on my machine. By contrast, building the string representation of this number takes a very long time.

For Python 3.x, use n // (10 ** k), because / performs true division there and returns a float.
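The steps above can be wrapped in a small helper. This is a minimal sketch for Python 3, not a definitive implementation: the name leading_digits and its count parameter are hypothetical, and the correction step is an added guard in case int(math.log10(n)) lands one off when n sits very close to a power of ten.

```python
import math

def leading_digits(n, count):
    """Return the first `count` decimal digits of a positive int n, as an int."""
    if n <= 0 or count <= 0:
        raise ValueError("n and count must be positive")
    # Estimated digit count minus the digits we keep = digits to drop.
    shift = max(int(math.log10(n)) + 1 - count, 0)
    head = n // 10 ** shift
    if head >= 10 ** count:
        # The estimate was one too low: one digit too many survived.
        head //= 10
    elif head < 10 ** (count - 1) and shift > 0:
        # The estimate was one too high: drop one digit fewer.
        head = n // 10 ** (shift - 1)
    return head

print(leading_digits(17 ** 1000, 10))
```

If n has fewer than count digits, the helper simply returns n unchanged.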

Some timings with this big number (the division is about 140 times faster than the string conversion):

%timeit s = str(n)[:24]
1 loops, best of 3: 57.7 s per loop

%timeit n/10**(int(math.log10(n))-24)
1 loops, best of 3: 412 ms per loop


# With a 200K-digit number (51x faster)

%timeit s = str(n)[:24]
1 loops, best of 3: 532 ms per loop

%timeit n/10**(int(math.log10(n))-24)
100 loops, best of 3: 10.4 ms per loop


# With a 20K-digit number (19x faster)

%timeit s = str(n)[:24]
100 loops, best of 3: 5.4 ms per loop

%timeit n/10**(int(math.log10(n))-24)
1000 loops, best of 3: 272 us per loop

Question 2

Python 2.7 has the bit_length() method on integers (it is also available in Python 3.1 and later).
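As an illustration of what bit_length() gives you, the sketch below uses it to bracket the number of decimal digits without converting to a string. The helper name decimal_digit_bounds is hypothetical. It relies only on the fact that 2**(bits-1) <= n < 2**bits for a positive n.

```python
import math

def decimal_digit_bounds(n):
    """Return (low, high) bounds on the decimal digit count of a positive int n."""
    if n <= 0:
        raise ValueError("n must be positive")
    bits = n.bit_length()  # number of bits needed to represent n
    # 2**(bits-1) <= n < 2**bits, so log10(n) lies between these two values.
    low = int((bits - 1) * math.log10(2)) + 1
    high = int(bits * math.log10(2)) + 1
    return low, high

print(decimal_digit_bounds(17 ** 1000))
```

The two bounds differ by at most one, since a single extra bit adds only about 0.301 decimal digits.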

Question 3

Here is a very ugly one-liner that extracts the first few decimal digits:

(x >> (x.bit_length()-50))*(10**(math.fmod((x.bit_length()-50)*math.log(2)/math.log(10), 1)))

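If x is around 10,000 decimal digits long, the result should be accurate to around 12 digits. Accuracy decreases as x gets larger, because the rounding error in the floating-point exponent (x.bit_length()-50)*log(2)/log(10) grows with its magnitude before fmod extracts the fractional part.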
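The one-liner is easier to follow when split into steps. The sketch below is the same computation under a hypothetical name, leading_digits_float, with math.log10(2) standing in for math.log(2)/math.log(10) and an added early return for values that already fit in the requested number of bits.

```python
import math

def leading_digits_float(x, bits=50):
    """Approximate the leading decimal digits of a positive int x as a float."""
    shift = x.bit_length() - bits
    if shift <= 0:
        return float(x)            # small enough to convert directly
    top = x >> shift               # the top `bits` bits of x
    exp10 = shift * math.log10(2)  # x is roughly top * 10**exp10
    frac = math.fmod(exp10, 1)     # discard the integer part of the exponent
    return top * 10 ** frac        # same leading digits as x, as a float

print(leading_digits_float(17 ** 1000))
```

The returned float carries the leading digits in its mantissa. Its decimal exponent is not meaningful, because the integer part of the power of ten has been thrown away.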

If you are willing to use external modules, I would look at gmpy2. The gmpy2 library provides access to the GMP (or MPIR) library for multiple-precision integer and fractional arithmetic, the MPFR library for multiple-precision floating-point arithmetic, and the MPC library for multiple-precision complex arithmetic. gmpy2 integers are faster than Python's native longs, and you can convert a long integer into a floating-point number to extract just the leading digits. The one-liner above then becomes:

gmpy2.mpfr(x).digits()[0]

The gmpy2 approach will retain accuracy even as the numbers become larger.

Disclaimer: I maintain gmpy2.