I read about List comprehension without [ ] in Python so now I know that
''.join([str(x) for x in mylist])
is faster than
''.join(str(x) for x in mylist)
because "list comprehensions are highly optimized".
So I suppose the optimization works like this: while parsing the for expression, Python sees mylist, computes its length, and uses it to pre-allocate the exact array size, which saves a lot of reallocation.
When using ''.join(str(x) for x in mylist), join receives a generator blindly and has to build its list without knowing the size in advance.
But now consider this:
mylist = [1,2,5,6,3,4,5]
''.join([str(x) for x in mylist if x < 4])
How does Python decide on the size of the list comprehension? Is it computed from the size of mylist and downsized when the iterations are done (which could be very bad if the list is big and the condition filters out 99% of the elements), or does it revert to the "don't know the size in advance" case?
EDIT: I've done some small benchmarks and it seems to confirm that there's an optimization:
without a condition:
import timeit
print(timeit.timeit("''.join([str(x) for x in [1,5,6,3,5,23,334,23234]])"))
print(timeit.timeit("''.join(str(x) for x in [1,5,6,3,5,23,334,23234])"))
yields (as expected):
3.11010817019474
3.3457350077491026
with a condition:
print(timeit.timeit("''.join([str(x) for x in [1,5,6,3,5,23,334,23234] if x < 50])"))
print(timeit.timeit("''.join(str(x) for x in [1,5,6,3,5,23,334,23234] if x < 50)"))
yields:
2.7942209702566965
3.0316467566203276
So the conditional list comprehension is still faster.
List comprehensions provide a simple way to create a list from some sequence or another list that we can loop over. In Python terminology, anything that we can loop over is called an iterable. At its most basic level, a list comprehension is a syntactic construct for creating lists from existing iterables.
In one reported benchmark, a for loop was slower than the equivalent list comprehension (9.9 seconds vs. 8.2 seconds). List comprehensions are faster than for loops for creating lists, but only because the loop builds its list by appending a new element at each iteration.
Because of differences in how Python implements for loops and list comprehensions, list comprehensions are almost always faster than for loops when the goal is to build a list.
So what's the difference between generator expressions and list comprehensions? A generator yields one item at a time and produces each item only on demand, whereas a list comprehension builds the whole list in memory. Generator expressions are therefore more memory efficient than list comprehensions.
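To make the memory difference concrete, here is a small sketch. It assumes CPython, where sys.getsizeof reports only the container's own footprint (not the elements it refers to). The names n, as_list and as_gen are made up for illustration.

```python
import sys

n = 100_000

# The list comprehension materializes all n results immediately.
as_list = [x * 2 for x in range(n)]

# The generator expression only creates a small generator object;
# no element is computed until something asks for it.
as_gen = (x * 2 for x in range(n))

size_list = sys.getsizeof(as_list)  # grows with n
size_gen = sys.getsizeof(as_gen)    # small and independent of n

print("list object:     ", size_list, "bytes")
print("generator object:", size_gen, "bytes")

# Items are produced lazily, one at a time.
first = next(as_gen)   # computes only the first item
total = sum(as_gen)    # consumes the remaining items without storing them
print(first, total)
```

Note that this says nothing about speed: when the consumer needs every item anyway (as ''.join does), the lazy version saves no work.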
List comprehensions don't pre-size the list, even when they totally could. You're assuming the presence of an optimization that isn't actually done.
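One rough way to see this on CPython is to compare sys.getsizeof for a list allocated with an exact size against lists built by comprehensions. This is only a sketch: the exact byte counts are an implementation detail that varies by version and platform, and the variable names are made up for illustration.

```python
import sys

mylist = list(range(1000))

# [None] * n allocates exactly n slots up front.
exact = [None] * len(mylist)

# A comprehension starts from an empty list and appends one item at a time,
# so it ends up with whatever capacity the append growth strategy left it.
grown = [x for x in mylist]

# With a condition that keeps only 10 items, the result is sized for those
# few appends, not derived from len(mylist).
filtered = [x for x in mylist if x < 10]

size_exact = sys.getsizeof(exact)
size_grown = sys.getsizeof(grown)
size_filtered = sys.getsizeof(filtered)

print("exact allocation:       ", size_exact)
print("unfiltered comprehension:", size_grown)
print("filtered comprehension:  ", size_filtered)
```

If the comprehension pre-sized from len(mylist), the unfiltered result would match the exact allocation and the filtered one would either be as large or would need shrinking afterwards. On CPython the unfiltered result is typically a bit larger (leftover append over-allocation) and the filtered one is tiny, which fits the "don't know the size in advance" case.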
The list comprehension is faster because all the iterator machinery and the work of entering and exiting the genexp stack frame has a cost. The list comprehension doesn't need to pay that cost.
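You can check both points by looking at the bytecode. The sketch below assumes CPython; opcode names are implementation details and may change between versions. The instructions helper is written just for this example and walks nested code objects, since older versions compile a comprehension into its own code object while newer ones inline it.

```python
import dis

def instructions(code):
    """Yield every instruction in a code object and its nested code objects."""
    yield from dis.get_instructions(code)
    for const in code.co_consts:
        if hasattr(const, "co_code"):  # a nested code object
            yield from instructions(const)

# Only compiled, never executed, so mylist does not need to exist.
listcomp = list(instructions(
    compile("[str(x) for x in mylist if x < 4]", "<demo>", "eval")))
genexp = list(instructions(
    compile("(str(x) for x in mylist if x < 4)", "<demo>", "eval")))

# The list comprehension starts from BUILD_LIST with a count of 0 (an empty
# list) and fills it with LIST_APPEND; there is no length lookup on mylist.
build_list_args = [ins.arg for ins in listcomp if ins.opname == "BUILD_LIST"]
uses_append = any(ins.opname == "LIST_APPEND" for ins in listcomp)

# The generator expression yields each value back to its consumer, which
# means suspending and resuming a separate frame for every item.
genexp_yields = any(ins.opname == "YIELD_VALUE" for ins in genexp)
listcomp_yields = any(ins.opname == "YIELD_VALUE" for ins in listcomp)

print("BUILD_LIST args:", build_list_args)
print("list comp uses LIST_APPEND:", uses_append)
print("genexp uses YIELD_VALUE:", genexp_yields)
print("list comp uses YIELD_VALUE:", listcomp_yields)
```

So the extra cost in the ''.join(genexp) version comes from the per-item yield and resume, not from a missing pre-allocation in the list comprehension.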