List vs generator comprehension speed with join function [duplicate]

I got these examples from the official timeit documentation: https://docs.python.org/2/library/timeit.html

What exactly makes the first example (generator expression) slower than the second (list comprehension)?

>>> timeit.timeit('"-".join(str(n) for n in range(100))', number=10000)
0.8187260627746582
>>> timeit.timeit('"-".join([str(n) for n in range(100)])', number=10000)
0.7288308143615723
Kevin asked Jun 13 '16 04:06


1 Answer

The str.join method converts its iterable parameter to a list if it's not a list or tuple already. This lets the joining logic iterate over the items multiple times (it makes one pass to calculate the size of the result string, then a second pass to actually copy the data).

You can see this in the CPython source code:

PyObject *
PyUnicode_Join(PyObject *separator, PyObject *seq)
{
    /* lots of variable declarations at the start of the function omitted */

    fseq = PySequence_Fast(seq, "can only join an iterable");

    /* ... */
}

The PySequence_Fast function in the C API does just what I described. It converts an arbitrary iterable into a list (essentially by calling list on it), unless it already is a list or tuple.
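The behavior described so far can be modeled in pure Python. This is only a rough sketch, not CPython's actual code: the names sequence_fast and join_two_pass are hypothetical, the exact-type check reflects my understanding of what PySequence_Fast does, and the real implementation writes into a preallocated string buffer in C rather than a Python list.

```python
def sequence_fast(obj):
    """Rough model of PySequence_Fast (hypothetical helper name).

    Assumption: lists and tuples are returned as-is, anything else is
    consumed into a new list.
    """
    if type(obj) in (list, tuple):
        return obj
    return list(obj)


def join_two_pass(separator, iterable):
    """Illustrative two-pass join; needs an input it can traverse twice."""
    items = sequence_fast(iterable)

    # Pass 1: work out the exact length of the result.
    total = sum(len(piece) for piece in items)
    total += len(separator) * max(len(items) - 1, 0)

    # Pass 2: copy every piece into a buffer of that size.
    out = [""] * total
    pos = 0
    for index, piece in enumerate(items):
        if index:
            out[pos:pos + len(separator)] = separator
            pos += len(separator)
        out[pos:pos + len(piece)] = piece
        pos += len(piece)
    return "".join(out)
```

A generator can only be iterated once, so the model (like the real function) has to turn it into a list before the first pass. A list or tuple skips that copy.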

The conversion of the generator expression to a list means that the usual benefits of generators (a smaller memory footprint and the potential for short-circuiting) don't apply to str.join, and so the (small) additional overhead that the generator has makes its performance worse.
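One way to check this on your own machine is to time three variants side by side: the generator expression, the list comprehension, and a generator explicitly wrapped in list(). This is a minimal sketch for Python 3; the function names are made up for illustration, and the absolute numbers will vary by interpreter version and hardware, so no expected timings are shown.

```python
import timeit


def join_generator():
    # str.join has to build a list from the generator internally.
    return "-".join(str(n) for n in range(100))


def join_list():
    # The list already exists, so str.join can use it directly.
    return "-".join([str(n) for n in range(100)])


def join_list_of_generator():
    # Makes the hidden conversion explicit: generator first, then list().
    return "-".join(list(str(n) for n in range(100)))


def measure(number=10000):
    """Return total seconds per variant for `number` calls each."""
    variants = (join_generator, join_list, join_list_of_generator)
    return {f.__name__: timeit.timeit(f, number=number) for f in variants}


if __name__ == "__main__":
    for name, seconds in measure().items():
        print(f"{name}: {seconds:.4f} s")
```

If the explanation above holds, join_generator and join_list_of_generator should come out close to each other, with join_list somewhat ahead.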

Blckknght answered Sep 17 '22 19:09