Not sure if title is correct terminology.
If you have to compare the characters in 2 strings (A,B) and count the number of matches of chars in B against A:
sum([ch in A for ch in B])
is faster on %timeit than
sum(ch in A for ch in B)
I understand that the first one will create a list of bool, and then sum the values of 1. The second one is a generator. I'm not clear on what it is doing internally and why it is slower?
Thanks.
Edit with %timeit results:
10 characters
generator expression
list
10000 loops, best of 3: 112 µs per loop
10000 loops, best of 3: 94.6 µs per loop
1000 characters
generator expression
list
100 loops, best of 3: 8.5 ms per loop
100 loops, best of 3: 6.9 ms per loop
10,000 characters
generator expression
list
10 loops, best of 3: 87.5 ms per loop
10 loops, best of 3: 76.1 ms per loop
100,000 characters
generator expression
list
1 loop, best of 3: 908 ms per loop
1 loop, best of 3: 840 ms per loop
Thus, generator expressions are faster than list comprehension and hence time efficient.
List comprehensions and generators are not different at all; they are just different ways of writing the same thing. A list comprehension produces a list as output, a generator produces a generator object.
As we can see, the for loop is slower than the list comprehension (9.9 seconds vs. 8.2 seconds). List comprehensions are faster than for loops to create lists. But, this is because we are creating a list by appending new elements to it at each iteration.
The only difference between Generator Comprehension and List Comprehension is that the former uses parentheses.
I took a look at the disassembly of each construct (using dis). I did this by declaring these two functions:
def list_comprehension():
return sum([ch in A for ch in B])
def generation_expression():
return sum(ch in A for ch in B)
and then calling dis.dis
with each function.
For the list comprehension:
0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
4 FOR_ITER 12 (to 18)
6 STORE_FAST 1 (ch)
8 LOAD_FAST 1 (ch)
10 LOAD_GLOBAL 0 (A)
12 COMPARE_OP 6 (in)
14 LIST_APPEND 2
16 JUMP_ABSOLUTE 4
18 RETURN_VALUE
and for the generator expression:
0 LOAD_FAST 0 (.0)
2 FOR_ITER 14 (to 18)
4 STORE_FAST 1 (ch)
6 LOAD_FAST 1 (ch)
8 LOAD_GLOBAL 0 (A)
10 COMPARE_OP 6 (in)
12 YIELD_VALUE
14 POP_TOP
16 JUMP_ABSOLUTE 2
18 LOAD_CONST 0 (None)
20 RETURN_VALUE
The disassembly for the actual summation is:
0 LOAD_GLOBAL 0 (sum)
2 LOAD_CONST 1 (<code object <genexpr> at 0x7f49dc395240, file "/home/mishac/dev/python/kintsugi/KintsugiModels/automated_tests/a.py", line 12>)
4 LOAD_CONST 2 ('generation_expression.<locals>.<genexpr>')
6 MAKE_FUNCTION 0
8 LOAD_GLOBAL 1 (B)
10 GET_ITER
12 CALL_FUNCTION 1
14 CALL_FUNCTION 1
16 RETURN_VALUE
but this sum
disassembly was constant between both your examples, with the only difference being the loading of generation_expression.<locals>.<genexpr>
vs list_comprehension.<locals>.<listcomp>
(so just loading a different local variable).
The differing bytecode instructions between the first two disassemblies are LIST_APPEND
for the list comprehension vs. the conjunction of YIELD_VALUE
and POP_TOP
for the generator expression.
I won't pretend I know the intrinsics of Python bytecode, but what I gather from this is that the generator expression is implemented as a queue, where the value is generated and then popped. This popping doesn't have to happen in a list comprehension, leading me to believe there'll be a slight amount of overhead in using generators.
Now this doesn't mean that generators are always going to be slower. Generators excel at being memory-efficient, so there will be a threshold N such that list comprehensions will perform slightly better before this threshold (because memory use won't be a problem), but after this threshold, generators will significantly perform better.
Generators are usually slower than list comprehension, the whole point of generators is to be memory efficient because they yield each item by creating them in a lazy fashion (only when actually needed). They favor memory efficency over speed.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With