I am currently learning about generators and list comprehensions. While experimenting with the profiler to look for performance gains, I ran into this cProfile output for a sum of even numbers over a large range, computed both ways.
I can see that in the generator version <string>:1(<genexpr>) has a much shorter cumulative time than <string>:1(<module>) in its list counterpart, but the number of calls on that line is what baffles me. It is making calls which I think are the check on each number, but then shouldn't there be another <string>:1 entry in the list comprehension profile as well?
Am I missing something in the profile?
In [8]: cProfile.run('sum((number for number in xrange(9999999) if number % 2 == 0))')
5000004 function calls in 1.111 seconds
Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  5000001    0.760    0.000    0.760    0.000 <string>:1(<genexpr>)
        1    0.000    0.000    1.111    1.111 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.351    0.351    1.111    1.111 {sum}
In [9]: cProfile.run('sum([number for number in xrange(9999999) if number % 2 == 0])')
3 function calls in 1.123 seconds
Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    1.075    1.075    1.123    1.123 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.048    0.048    0.048    0.048 {sum}
First of all, the calls are to the next (or __next__ in Python 3) method of the generator object, not to some even-number check.
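As a small illustration of that point, here is a sketch in Python 3 that drives a generator expression by hand. The range of 6 is just an example value; the idea is only to show that every value comes out of a next() call and that one final call raises StopIteration.

```python
# A generator expression with the same shape as the one in the question,
# shrunk to a tiny range so it can be stepped through by hand.
evens = (n for n in range(6) if n % 2 == 0)

# Each next() call resumes the generator until it yields the next value.
first = next(evens)
second = next(evens)
third = next(evens)

# One more call finds nothing left and raises StopIteration.
try:
    next(evens)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)  # 0 2 4 True
```

sum() does the same thing internally, which is why the profiler attributes one <genexpr> entry to each of these resumptions.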
In Python 2 you will not get an additional line for a list comprehension (LC), because an LC does not create a separate code object there; it runs inline in the enclosing code. In Python 3 you will, because an additional code object (<listcomp>) is now created for an LC as well, to make it behave like a generator expression.
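One way to check this on your own interpreter is to compile both expressions and list the code objects nested inside the resulting module code. This is only a sketch: nested_code_names is a hypothetical helper, not part of any library, and the result for the list comprehension depends on the Python version (as far as I know, Python 3.12 and later inline comprehensions again, so <listcomp> no longer appears there).

```python
import types


def nested_code_names(source):
    """Hypothetical helper: names of code objects nested directly in `source`."""
    module_code = compile(source, "<string>", "exec")
    return [
        const.co_name
        for const in module_code.co_consts
        if isinstance(const, types.CodeType)
    ]


genexp_names = nested_code_names("sum(n for n in range(10) if n % 2 == 0)")
listcomp_names = nested_code_names("sum([n for n in range(10) if n % 2 == 0])")

# A generator expression always gets its own code object.
print(genexp_names)
# Version dependent: a <listcomp> entry on interpreters that give the
# comprehension its own code object, an empty list where it is inlined.
print(listcomp_names)
```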
>>> cProfile.run('sum([number for number in range(9999999) if number % 2 == 0])')
5 function calls in 1.751 seconds
Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    1.601    1.601    1.601    1.601 <string>:1(<listcomp>)
        1    0.068    0.068    1.751    1.751 <string>:1(<module>)
        1    0.000    0.000    1.751    1.751 {built-in method exec}
        1    0.082    0.082    0.082    0.082 {built-in method sum}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
>>> cProfile.run('sum((number for number in range(9999999) if number % 2 == 0))')
5000005 function calls in 2.388 seconds
Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  5000001    1.873    0.000    1.873    0.000 <string>:1(<genexpr>)
        1    0.000    0.000    2.388    2.388 <string>:1(<module>)
        1    0.000    0.000    2.388    2.388 {built-in method exec}
        1    0.515    0.515    2.388    2.388 {built-in method sum}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
The number of calls differs, though: 1 for the LC compared to 5000001 for the generator expression. This is mostly because sum is consuming the iterator and therefore has to call its __next__ method 5000000 + 1 times (the last call is the one that raises StopIteration and ends the iteration). For a list comprehension all the magic happens inside its code object, where the LIST_APPEND instruction appends items one by one to the list, i.e. there are no calls visible to cProfile.
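Both claims can be checked on a small scale. The sketch below is Python 3 and assumes CPython bytecode. CountingIterator and opnames are hypothetical helpers written for this illustration, and the range of 99 is an arbitrary small stand-in for 9999999.

```python
import dis
import types


class CountingIterator:
    """Hypothetical helper: wraps an iterator and counts __next__ calls."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.calls = 0

    def __iter__(self):
        return self

    def __next__(self):
        self.calls += 1
        return next(self._it)


def opnames(code):
    """Yield opcode names from a code object and all code objects nested in it."""
    for instr in dis.get_instructions(code):
        yield instr.opname
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from opnames(const)


# range(99) contains 50 even numbers, so sum() should need 50 + 1 calls:
# one per value plus the final call that raises StopIteration.
counted = CountingIterator(n for n in range(99) if n % 2 == 0)
total = sum(counted)
print(total, counted.calls)  # 2450 51

# The list comprehension builds its list with a bytecode instruction rather
# than a function call per item (CPython-specific detail).
listcomp_code = compile("[n for n in range(99) if n % 2 == 0]", "<string>", "exec")
has_list_append = "LIST_APPEND" in set(opnames(listcomp_code))
print(has_list_append)
```

The same arithmetic scaled up gives the 5000000 + 1 calls in the profile, while the list comprehension does its appending without any call that cProfile could record.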