Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python: how so fast?

The period of the Mersenne Twister used in the module random is (I am told) 2**19937 - 1. As a binary number, that is 19937 '1's in a row (if I'm not mistaken). Python converts it to decimal pretty darned fast:

$ python -m timeit '2**19937'
10000000 loops, best of 3: 0.0271 usec per loop

$ python -m timeit -s 'result = 0' 'result += 2**19937'
100000 loops, best of 3: 2.09 usec per loop

I guess the second version is the one that requires conversion?

And it's not just binary. This is also fast. (Rather than show the numbers, I show the length of the decimal converted to a string):

>>> import math
>>> N = 1000
>>> s = str((int(N*math.e))**(int(N*math.pi)))
>>> len(s)
10787
>>> N = 5000
>>> s = str((int(N*math.e))**(int(N*math.pi)))
>>> len(s)
64921

Timing:

python -m timeit -s 'import math' -s 'N=1000' 's = str((int(N*math.e))**(int(N*math.pi)))'
10 loops, best of 3: 51.2 msec per loop

The question is: how is this actually done?

Am I just naive to be impressed? I find the sight of the Python shell generating a number of 5000 or so places in an instant truly spectacular.

Edit:

Additional timings suggested by @dalke and @truppo

$ python -m timeit 'x=2' 'x**19937'
1000 loops, best of 3: 230 usec per loop
$ python -m timeit 'x=2' 'int(x**19937)'
1000 loops, best of 3: 232 usec per loop
$ python -m timeit 'x=2' 'str(x**19937)'
100 loops, best of 3: 16.6 msec per loop

$ python -m timeit -s 'result = 0' 'x = 2' 'result += x**19937'
1000 loops, best of 3: 237 usec per loop
$ python -m timeit -s 'result = 0' 'x = 2' 'result += x**19937' 'int(result)'
1000 loops, best of 3: 238 usec per loop
$ python -m timeit -s 'result = 0' 'x = 2' 'result += x**19937' 'str(result)'
100 loops, best of 3: 16.6 msec per loop

So it looks to me like result = 0; result += 2**19937 probably does force the conversion.

like image 709
telliott99 Avatar asked Jan 23 '10 23:01

telliott99


2 Answers

Hate to rain on your parade, but the reason it's so fast is because the math module is actually not implemented in Python.

Python supports loading shared libraries that export Python APIs, but are implemented in other languages. math.so, which provides the module you get from import math, happens to be one of those (and so is _random.so).

like image 73
clee Avatar answered Sep 19 '22 18:09

clee


When compiling to byte code, constant expressions such as 2**19937 will optimized down to a single constant:

>>> def foo(): return 2**10
... 
>>> import dis
>>> dis.dis(foo)
  1           0 LOAD_CONST               3 (1024)
              3 RETURN_VALUE        
>>> 

Consider instead:

[~] python -m timeit 'x=2' 'x**19937'
1000 loops, best of 3: 210 usec per loop
like image 37
mthurlin Avatar answered Sep 17 '22 18:09

mthurlin