In a slightly contrived experiment I wanted to compare some of Python's built-in functions to those of numpy. When I started timing these though, I found something bizarre.
When I wrote the following:
import timeit
timeit.timeit('import math; math.e**2', number=1000000)
I got two distinct results that alternated almost at random, and the gap between them was far too large to be measurement noise.
The timing alternated between about 2 seconds and about 0.5 seconds.
This confused me, so I ran some experiments to figure out what was going on, which only left me more confused. First I tried:
[timeit.timeit('import math; math.e**2', number=1000000) for i in xrange(100)]
which produced only the 0.5-second result. I then tried the same thing with a generator expression:
test = (timeit.timeit('import math; math.e**2', number=1000000) for i in xrange(100))
[item for item in test]
which produced a list containing only the 2.0-second result.
On the suggestion of alecxe I changed my timeit statement to:
timeit.timeit('math.e**2', 'import math', number=1000000)
which similarly alternated between about 0.1 and 0.4 seconds. When I reran the experiment comparing generators and list comprehensions, however, the results were flipped: the generator expression consistently gave the 0.1-second result, while the list comprehension returned a list containing only the 0.4-second result.
Direct console output:
>>> test = (timeit.timeit('math.e**2', 'import math', number=1000000) for i in xrange(100))
>>> test.next()
0.15114784240722656
>>> timeit.timeit('math.e**2', 'import math', number=1000000)
0.44176197052001953
>>>
Edit: I'm using Ubuntu 12.04 running dwm, and I've seen these results in both xterm and gnome-terminal. I'm using Python 2.7.3.
Does anybody know what's going on here? This seems really bizarre to me.
%%timeit. You can use the IPython magic command %%timeit to measure the execution time of a whole cell. As an example, try executing the same process using NumPy. As with %timeit, -n and -r are optional.
%timeit is a line magic command: the code to be timed must fit on a single line and is written after %timeit, separated from it by a space.
timeit() documentation: Time number executions of the main statement. This executes the setup statement once, and then returns the time it takes to execute the main statement a number of times, measured in seconds as a float.
The default value of the number parameter is 1 million (1000000). The return value of timeit.timeit() is the time, in seconds, that it took to execute the code snippet that many times.
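The documented behaviour quoted above (setup runs once, the statement runs number times, the result is a float in seconds) can be checked with a small sketch. The counter dictionary and the two helper functions are hypothetical names introduced only for this illustration; it relies on timeit.timeit accepting callables as well as strings.

```python
import timeit

# Hypothetical counters, used only to observe how often each callable runs.
calls = {"setup": 0, "stmt": 0}

def count_setup():
    calls["setup"] += 1

def count_stmt():
    calls["stmt"] += 1

# stmt and setup may be callables instead of strings.
elapsed = timeit.timeit(count_stmt, setup=count_setup, number=1000)

print(calls)          # {'setup': 1, 'stmt': 1000}
print(type(elapsed))  # a float, measured in seconds
```

Anything placed in setup is therefore excluded from the per-iteration cost, while anything placed in the statement is paid on every loop.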
It turns out there were a couple of things happening here. Some of these quirks may be specific to my machine, but I figure it's worth posting them in case someone else is puzzled by the same thing.
Firstly, there's a difference between the two timeit calls, in that with:
timeit.timeit('math.e**2', 'import math', number=1000000)
the import statements are lazily loaded. This becomes obvious if you try the following experiment:
timeit.timeit('1+1', 'import math', number=1000000)
versus:
timeit.timeit('1+1', number=1000000)
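A rough way to rerun these comparisons is to repeat each variant several times and keep the minimum, which is less sensitive to interference from other processes. This is only a sketch: the loop count is reduced from 1000000 to keep it quick, the labels and the variants dictionary are made up for the example, and the absolute numbers will differ from machine to machine.

```python
import timeit

NUMBER = 100000  # fewer loops than the 1000000 used above, to keep the run short
REPEAT = 5       # how many independent measurements to take per variant

# Hypothetical labels mapped to the timeit arguments being compared.
variants = {
    "import in statement": {"stmt": "import math; math.e**2"},
    "import in setup": {"stmt": "math.e**2", "setup": "import math"},
    "1+1 with setup import": {"stmt": "1+1", "setup": "import math"},
    "1+1 without import": {"stmt": "1+1"},
}

results = {}
for label, kwargs in variants.items():
    # timeit.repeat returns a list with one total time per repetition.
    timings = timeit.repeat(number=NUMBER, repeat=REPEAT, **kwargs)
    results[label] = min(timings)
    print("%-24s %.4f s" % (label, results[label]))
```

If the figures for a single variant still jump between two levels across runs, that points to something outside the statement being timed, such as the machine's configuration.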
So when it was directly run in the list comprehension it looks like this import statement was being loaded for every entry. (Exact reasons for this are probably related to my configuration).
Past that, going back to the original question, it looks like 3/4 of the time was actually spent on import math. My guess is that when the statement was run through the generator expression there was no caching of the import between iterations, while the import was cached within the list comprehension (again, the exact reason for this is probably configuration-specific).
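For background on the import caching mentioned here, the following sketch shows the standard behaviour: once a module has been imported it is stored in sys.modules, and later import statements reuse that object. It does not explain the machine-specific differences described above; the variable names are chosen only for the illustration.

```python
import sys

import math                    # first import: the module is loaded, or found already cached
cached = sys.modules["math"]   # entry in the interpreter-wide module cache

import math as math_again      # repeated import: served from sys.modules
same_object = math_again is cached

print(same_object)             # True
```

Even when it is served from the cache, an import statement still does a lookup and a name binding each time it executes, so placing it inside the timed statement adds to every iteration.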