So I got these examples from the official documentation. https://docs.python.org/2/library/timeit.html
What exactly makes the first example (generator expression) slower than the second (list comprehension)?
>>> timeit.timeit('"-".join(str(n) for n in range(100))', number=10000)
0.8187260627746582
>>> timeit.timeit('"-".join([str(n) for n in range(100)])', number=10000)
0.7288308143615723
List comprehensions return the entire list, and the generator expression returns only the generator object. The values will be the same as those in the list, but they will be accessed one at a time by using the next() function. This is what makes list comprehensions faster than generator expressions.
List comprehensions are usually faster than generator expressions as generator expressions create another layer of overhead to store references for the iterator. However, the performance difference is often quite small.
The only difference between Generator Comprehension and List Comprehension is that the former uses parentheses.
So what's the difference between Generator Expressions and List Comprehensions? The generator yields one item at a time and generates item only when in demand. Whereas, in a list comprehension, Python reserves memory for the whole list. Thus we can say that the generator expressions are memory efficient than the lists.
The str.join
method converts its iterable parameter to a list if it's not a list or tuple already. This lets the joining logic iterate over the items multiple times (it makes one pass to calculate the size of the result string, then a second pass to actually copy the data).
You can see this in the CPython source code:
PyObject *
PyUnicode_Join(PyObject *separator, PyObject *seq)
{
/* lots of variable declarations at the start of the function omitted */
fseq = PySequence_Fast(seq, "can only join an iterable");
/* ... */
}
The PySequence_Fast
function in the C API does just what I described. It converts an arbitrary iterable into a list (essentially by calling list
on it), unless it already is a list or tuple.
The conversion of the generator expression to a list means that the usual benefits of generators (a smaller memory footprint and the potential for short-circuiting) don't apply to str.join
, and so the (small) additional overhead that the generator has makes its performance worse.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With