
Why is a generator produced by yield faster than a generator produced by xrange?

I was investigating Python generators and decided to run a little experiment.

TOTAL = 100000000
def my_sequence():
    i = 0
    while i < TOTAL:
        yield i
        i += 1

def my_list():
    return range(TOTAL)

def my_xrange():
    return xrange(TOTAL)    

Memory usage (using psutil to get process RSS memory) and time taken (using time.time()) are shown below after running each method several times and taking the average:

sequence_of_values = my_sequence() # Memory usage: 6782976B  Time taken: 9.53674e-07 s

sequence_of_values2 = my_xrange() # Memory usage: 6774784B  Time taken: 2.14576e-06 s

list_of_values = my_list() # Memory usage: 3266207744B  Time taken: 1.80253s
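
One way to take such measurements (a simplified sketch, not the exact harness; the measure helper here is illustrative):

import os
import time
import psutil

def measure(fn):
    # Illustrative helper: report the whole-process RSS (via psutil; assumes
    # a version where Process.memory_info() exists) and the wall time for
    # a single call.
    proc = psutil.Process(os.getpid())
    start = time.time()
    result = fn()
    elapsed = time.time() - start
    print '%s: %d B RSS, %.6g s' % (fn.__name__, proc.memory_info().rss, elapsed)
    return result

sequence_of_values = measure(my_sequence)
sequence_of_values2 = measure(my_xrange)
list_of_values = measure(my_list)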

I noticed that producing a generator by using xrange is consistently (slightly) slower than producing one by using yield. Why is that so?

asked Jul 28 '16 by Yan Yi



2 Answers

I'm going to preface this answer by saying that timings on this scale are likely going to be hard to measure accurately (probably best to use timeit) and that these sorts of optimizations will almost never make any difference in your actual program's runtime ...

Ok, now the disclaimer is done ...

The first thing that you need to notice is that you're only timing the construction of the generator/xrange object -- You are NOT timing how long it takes to actually iterate over the values¹. There are a couple reasons why creating the generator might be faster in some cases than creating the xrange object...

  1. For the generator case, you're only creating a generator -- No code in the generator actually gets run. This amounts to roughly 1 function call.
  2. For the xrange case, you're calling the function, and inside it you have to look up the global name xrange and the global TOTAL before calling the builtin -- So there are more things being executed in this case (the bytecode sketch below makes this concrete).
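
You can see this with the dis module (a sketch; the commented output is for CPython 2.7, and exact offsets vary by version):

import dis

# my_xrange's body runs on every call: two global lookups plus a call to
# the xrange builtin.
dis.dis(my_xrange)
#  0 LOAD_GLOBAL      0 (xrange)
#  3 LOAD_GLOBAL      1 (TOTAL)
#  6 CALL_FUNCTION    1
#  9 RETURN_VALUE

# Calling my_sequence() only constructs a generator object -- none of the
# loop bytecode that dis prints here runs until you iterate over it.
dis.dis(my_sequence)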

As for memory -- In both of the lazy approaches, the memory used will be dominated by the Python runtime -- Not by the size of your generator objects. The only case where the memory use is impacted appreciably by your script is the case where you construct a list of 100 million items.
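
One quick way to see that (a sketch; exact sizes vary by platform and build):

import sys

print sys.getsizeof(my_sequence())  # just the generator object: ~80B
print sys.getsizeof(my_xrange())    # just the xrange object: ~40B
# Note: the next call really does build the 100-million-item list, and
# getsizeof counts only the list's pointer array, not the int objects in it.
print sys.getsizeof(my_list())      # ~800MB for the pointer array alone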

Also note that I can't actually confirm your results consistently on my system... Using timeit, I actually get that my_xrange is sometimes² faster to construct (by ~30%).

Adding the following to the bottom of your script:

from timeit import timeit
print timeit('my_xrange()', setup='from __main__ import my_xrange')
print timeit('my_sequence()', setup='from __main__ import my_sequence')

And my results are (for CPython on OS X El Capitan):

0.227491140366
0.356791973114

However, PyPy seems to favor the generator construction (I tried it with both my_xrange first and my_sequence first and got fairly consistent results, though the first one to run does seem to be at a bit of a disadvantage -- maybe due to JIT warm-up time or something):

0.00285911560059
0.00137305259705
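
Given the measurement noise mentioned in the disclaimer, a slightly more robust variant (a sketch) is timeit.repeat, taking the minimum of several runs:

from timeit import repeat
print min(repeat('my_xrange()', setup='from __main__ import my_xrange', repeat=5))
print min(repeat('my_sequence()', setup='from __main__ import my_sequence', repeat=5))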

¹Here, I would expect xrange to have the edge -- but again, nothing is true until you timeit, and then it's only true if the timing differences are significant, and only on the computer where you did the timings.
²See opening disclaimer :-P

answered Sep 29 '22 by mgilson


As I mentioned in my comment above, with your generator function and with xrange, you're not actually creating the sequence, merely creating the object. @mgilson's answer covers the calls related to creating them.

As for actually doing something with them:

>>> TOTAL = 100000
>>> # your functions here
...
>>> import timeit
>>> timeit.timeit("list(my_seq())", setup="from __main__ import my_seq", number=1000)
9.783777457339898
>>> timeit.timeit("list(my_xrange())", setup="from __main__ import my_xrange", number=1000)
1.2652621698083024
>>> timeit.timeit("list(my_list())", setup="from __main__ import my_list", number=1000)
2.666709824464867
>>> timeit.timeit("my_list()", setup="from __main__ import my_list", number=1000)
1.2324339537661615
  1. You'll see that I'm creating a list out of each, so I'm actually consuming the sequences.

  2. The generator function takes nearly 10x as long as xrange.

  3. list(my_list()) is redundant since my_list already returns the list produced by range, so I timed it one more time without the call to list().

  4. range comes out nearly the same as xrange here, but only because I reduced TOTAL. The biggest difference is memory: range creates the entire list up front, so it consumes more memory and takes longer only in that construction step. Creating a list from xrange is effectively the same as range, so the final memory used is the same; and since I'm merely creating a list out of the xrange, the difference is hard to see in this trivial case (see the sketch below).
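
A quick way to see that last point (a sketch; sizes are approximate, for 64-bit CPython 2, and getsizeof counts the list's pointer array rather than the int objects):

import sys
TOTAL = 100000
print sys.getsizeof(xrange(TOTAL))        # tiny (tens of bytes), regardless of TOTAL
print sys.getsizeof(range(TOTAL))         # ~800KB: the whole list exists up front
print sys.getsizeof(list(xrange(TOTAL)))  # ~800KB: same size once xrange is materialized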

answered Sep 29 '22 by aneroid