Two very different but very consistent results from Python timeit

Tags:

performance

python

generator

list-comprehension

In a slightly contrived experiment I wanted to compare some of Python's built-in functions to those of numpy. When I started timing these though, I found something bizarre.

When I wrote the following:

import timeit
timeit.timeit('import math; math.e**2', number=1000000)

I would get one of two distinct results, alternating almost at random, and the gap between them was far too large and too consistent to be noise: about 2 seconds or about 0.5 seconds.

This confused me, so I ran some experiments to figure out what was going on, which only confused me more. First I tried a list comprehension:

[timeit.timeit('import math; math.e**2', number=1000000) for i in xrange(100)]

which produced the 0.5-second result every time. I then tried driving the same call from a generator expression:

test = (timeit.timeit('import math; math.e**2', number=1000000) for i in xrange(100))
[item for item in test]

which produced a list containing only the 2.0-second result.
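A minimal sketch for putting the two experiments side by side, in case it helps anyone reproduce this. It assumes Python 3 (so `range` instead of `xrange`), and the names `sample`, `NUMBER` and `ROUNDS` are made up for illustration. It only collects and summarizes the timings; it makes no claim about which variant will come out faster on a given machine.

```python
import statistics
import timeit

STMT = "import math; math.e**2"
NUMBER = 100000  # reduced from the 1000000 used above so the sketch runs quickly
ROUNDS = 5       # hypothetical round count, far fewer than the 100 used above


def sample():
    # One timing of the statement, exactly as a single timeit.timeit() call would do it.
    return timeit.timeit(STMT, number=NUMBER)


# Timings collected eagerly by a list comprehension.
from_list = [sample() for _ in range(ROUNDS)]

# Timings produced lazily by a generator expression, then drained into a list.
lazy = (sample() for _ in range(ROUNDS))
from_gen = [t for t in lazy]

for label, times in (("list comprehension", from_list), ("generator", from_gen)):
    print("%-20s min=%.4f median=%.4f max=%.4f"
          % (label, min(times), statistics.median(times), max(times)))
```

Looking at min, median and max together makes it easier to see whether the readings really fall into two separate groups.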

On the suggestion of alecxe I changed my timeit statement to:

timeit.timeit('math.e**2', 'import math', number=1000000)

which similarly alternated between about 0.1 and 0.4 seconds. When I reran the experiment comparing generators and list comprehensions, though, the results were flipped: the generator expression consistently produced the 0.1-second result, while the list comprehension returned a list full of the 0.4-second result.

Direct console output:

>>> test = (timeit.timeit('math.e**2', 'import math', number=1000000) for i in xrange(100))
>>> test.next()
0.15114784240722656

>>> timeit.timeit('math.e**2', 'import math', number=1000000)
0.44176197052001953
>>>
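Two single readings like these are hard to compare on their own. A hedged sketch of one common alternative is to take several rounds with `timeit.Timer.repeat` and look at the minimum, which is the least disturbed by other activity on the machine. This assumes Python 3, and the round count of 7 and the reduced `number` are arbitrary choices, not values from the session above.

```python
import timeit

# One Timer object; 'import math' is the setup, as in the console session above.
timer = timeit.Timer("math.e**2", setup="import math")

# repeat() returns one total time per round; number is reduced from 1000000
# so the sketch finishes quickly.
totals = timer.repeat(repeat=7, number=100000)

best = min(totals)
print("best of %d rounds: %.4f s (worst: %.4f s)" % (len(totals), best, max(totals)))
```

If the best and worst rounds differ by the same factor seen above, the effect is reproducible within a single process and not just between separate calls.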

Edit: I'm using Ubuntu 12.04 running dwm, and I've seen these results in both xterm and gnome-terminal. I'm using Python 2.7.3.

Does anybody know what's going on here? This seems really bizarre to me.

368

asked Sep 01 '13 18:09

Slater Victoroff

1 Answer

It turns out there were a couple of things happening here. Some of these quirks may be specific to my machine, but I figure it's worth posting them in case someone else is puzzled by the same thing.

Firstly, there's a difference between the two timeit calls. In this form:

timeit.timeit('math.e**2', 'import math', number=1000000)

the import statement is passed as the setup argument and is lazily loaded. This becomes obvious if you try the following experiment:

timeit.timeit('1+1', 'import math', number=1000000)

versus:

timeit.timeit('1+1', number=1000000)

So when it was run directly in the list comprehension, it looks like this import statement was being executed for every entry. (The exact reasons are probably related to my configuration.)

Beyond that, going back to the original question, it looks like 3/4 of the time was actually spent on import math, so I'm guessing that when the expression was run through the generator there was no import caching between iterations, while there was import caching within the list comprehension. (Again, the exact reason is probably configuration-specific.)
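A rough way to check how much of the time goes to the import itself is to time the same expression with the import inside the statement and with it moved to the setup. This is only a sketch, assuming Python 3 and a reduced `number`; the variable names are made up. Note that re-executing `import math` for an already imported module does not reload it: Python finds the module in `sys.modules` and rebinds the name, which is cheap but still costs something on every iteration.

```python
import sys
import timeit

NUMBER = 100000  # reduced from 1000000 so the sketch runs quickly

# The import statement is re-executed on every timed iteration.
with_import = timeit.timeit("import math; math.e**2", number=NUMBER)

# The import runs once, in the setup; only the expression is timed.
setup_import = timeit.timeit("math.e**2", setup="import math", number=NUMBER)

print("math cached in sys.modules:", "math" in sys.modules)
print("import in statement: %.4f s" % with_import)
print("import in setup:     %.4f s" % setup_import)
```

The difference between the two figures approximates the per-iteration cost of the repeated import statement on the machine where it is run.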

197

answered Nov 14 '22 23:11

Slater Victoroff
