The period of the Mersenne Twister used in the module random
is (I am told) 2**19937 - 1. As a binary number, that is 19937 '1's in a row (if I'm not mistaken). Python converts it to decimal pretty darned fast:
$ python -m timeit '2**19937'
10000000 loops, best of 3: 0.0271 usec per loop
$ python -m timeit -s 'result = 0' 'result += 2**19937'
100000 loops, best of 3: 2.09 usec per loop
I guess the second version is the one that requires conversion?
And it's not just binary. This is also fast. (Rather than show the numbers, I show the length of the decimal converted to a string):
>>> import math
>>> N = 1000
>>> s = str((int(N*math.e))**(int(N*math.pi)))
>>> len(s)
10787
>>> N = 5000
>>> s = str((int(N*math.e))**(int(N*math.pi)))
>>> len(s)
64921
Timing:
python -m timeit -s 'import math' -s 'N=1000' 's = str((int(N*math.e))**(int(N*math.pi)))'
10 loops, best of 3: 51.2 msec per loop
The question is: how is this actually done?
Am I just naive to be impressed? I find the sight of the Python shell generating a number of 5000 or so places in an instant truly spectacular.
Edit:
Additional timings suggested by @dalke and @truppo
$ python -m timeit 'x=2' 'x**19937'
1000 loops, best of 3: 230 usec per loop
$ python -m timeit 'x=2' 'int(x**19937)'
1000 loops, best of 3: 232 usec per loop
$ python -m timeit 'x=2' 'str(x**19937)'
100 loops, best of 3: 16.6 msec per loop
$ python -m timeit -s 'result = 0' 'x = 2' 'result += x**19937'
1000 loops, best of 3: 237 usec per loop
$ python -m timeit -s 'result = 0' 'x = 2' 'result += x**19937' 'int(result)'
1000 loops, best of 3: 238 usec per loop
$ python -m timeit -s 'result = 0' 'x = 2' 'result += x**19937' 'str(result)'
100 loops, best of 3: 16.6 msec per loop
So it looks to me like result = 0; result += 2**19937
probably does force the conversion.
Hate to rain on your parade, but the reason it's so fast is because the math module is actually not implemented in Python.
Python supports loading shared libraries that export Python APIs, but are implemented in other languages. math.so, which provides the module you get from import math
, happens to be one of those (and so is _random.so).
When compiling to byte code, constant expressions such as 2**19937
will optimized down to a single constant:
>>> def foo(): return 2**10
...
>>> import dis
>>> dis.dis(foo)
1 0 LOAD_CONST 3 (1024)
3 RETURN_VALUE
>>>
Consider instead:
[~] python -m timeit 'x=2' 'x**19937'
1000 loops, best of 3: 210 usec per loop
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With