I'm working with numbers that have tens of thousands of digits in Python. The long type works beautifully for doing math on these numbers, but I can't access their highest digits fast enough. Note that I don't know exactly how many digits a number contains. By "highest digits" I mean the digits in the most significant places; the lowest digits can be accessed quickly using modulus.
I can think of two ways to access these digits in Python, but both are too slow for my purposes. I have tried converting to a string and accessing the digits by indexing, but the conversion is slow when you have 10,000+ digits. Alternatively, I could mask out bits and truncate, but that requires knowing how many digits are in the long. Finding the number of digits would require a loop over a counter and a mask test, which would surely be slower than string conversion.
From the description here it seems that the long type does in fact contain a bignum array. Is there some way I can access the underlying data structure that stores the long, or possibly check how many digits the long has from the base type?
If people are interested I can provide an example with benchmarks.
A simple approach that avoids digging into the low-level implementation of the long type:
>>> import math
>>> n = 17**987273 # a number with about 1.2 million digits
>>> digits = int(math.log10(n)) # digit count minus one
>>> k = digits - 24 # dividing by 10**k leaves the leading 25 digits
>>> n / (10 ** k)
9953043281569299242668853L
This runs quite fast on my machine. Getting the string representation of this number, by contrast, takes a very long time.
For Python 3.x, use n // (10 ** k)
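The same idea can be wrapped in a small function. This is only a sketch for Python 3: the name leading_digits and the correction step are my own additions, not part of the answer above. math.log10 returns a float, so the estimated digit count may be off by one near powers of ten, and the function adjusts for that after dividing.

```python
import math

def leading_digits(n, count):
    """Return the `count` most significant decimal digits of a positive int.

    Hypothetical helper: if n has fewer than `count` digits, n is returned.
    """
    if n <= 0 or count <= 0:
        raise ValueError("n and count must be positive integers")
    # int(log10(n)) + 1 estimates the digit count; treat it as approximate
    # because the logarithm is computed in floating point.
    num_digits = int(math.log10(n)) + 1
    shift = max(num_digits - count, 0)
    head = n // 10 ** shift
    if head >= 10 ** count:
        # The estimate was one too low: drop the extra trailing digit.
        head //= 10
    elif shift > 0 and head < 10 ** (count - 1):
        # The estimate was one too high: divide by one power of ten less.
        head = n // 10 ** (shift - 1)
    return head
```

For example, leading_digits(17**1000, 5) should agree with int(str(17**1000)[:5]), without converting the whole number to a string.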
Some timings with this big number (the division is about 140 times faster):
%timeit s = str(n)[:24]
1 loops, best of 3: 57.7 s per loop
%timeit n/10**(int(math.log10(n))-24)
1 loops, best of 3: 412 ms per loop
# With a 200K digits number (51x faster)
%timeit s = str(n)[:24]
1 loops, best of 3: 532 ms per loop
%timeit n/10**(int(math.log10(n))-24)
100 loops, best of 3: 10.4 ms per loop
# With a 20K digits number (19x faster)
%timeit s = str(n)[:24]
100 loops, best of 3: 5.4 ms per loop
%timeit n/10**(int(math.log10(n))-24)
1000 loops, best of 3: 272 us per loop
Python 2.7 has the bit_length() method on integers.
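As a rough sketch of how bit_length() could be used here: the bit count bounds the decimal digit count to within one, and a comparison against a power of ten settles the exact value. The function name decimal_digit_count is hypothetical, and the float multiplication by log10(2) is treated as an estimate that the comparisons then correct.

```python
import math

def decimal_digit_count(n):
    """Return the number of decimal digits of a positive int (hypothetical helper)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # n >= 2**(bit_length - 1), so this is a lower bound on the digit count,
    # up to floating-point rounding in the multiplication.
    estimate = int((n.bit_length() - 1) * math.log10(2)) + 1
    if n >= 10 ** estimate:
        estimate += 1  # n has one more digit than the lower bound
    elif n < 10 ** (estimate - 1):
        estimate -= 1  # guard against the float product rounding upward
    return estimate
```

Once the digit count d is known, the leading m digits are n // 10 ** (d - m) for m <= d. Building the power of ten still costs a large-integer operation, so this is not free, but it avoids the full string conversion.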