
Are list comprehensions syntactic sugar for `list(generator expression)` in Python 3?

In Python 3, is a list comprehension simply syntactic sugar for a generator expression fed into the list function?

e.g. is the following code:

squares = [x**2 for x in range(1000)] 

actually converted in the background into the following?

squares = list(x**2 for x in range(1000)) 

I know the output is identical, and Python 3 fixes the surprising side effects on surrounding namespaces that list comprehensions had, but in terms of what the CPython interpreter does under the hood, is the former converted to the latter, or is there any difference in how the code gets executed?

Background

I found this claim of equivalence in the comments section to this question, and a quick google search showed the same claim being made here.

There was also some mention of this in the What's New in Python 3.0 docs, but the wording is somewhat vague:

Also note that list comprehensions have different semantics: they are closer to syntactic sugar for a generator expression inside a list() constructor, and in particular the loop control variables are no longer leaked into the surrounding scope.

zehnpaard asked May 07 '15 08:05



2 Answers

The two forms work differently. The list comprehension version takes advantage of the special LIST_APPEND bytecode, which calls PyList_Append directly. It therefore avoids both an attribute lookup of list.append and a function call at the Python level.

>>> import dis
>>> def func_lc():
    [x**2 for x in y]
...
>>> dis.dis(func_lc)
  2           0 LOAD_CONST               1 (<code object <listcomp> at 0x10d3c6780, file "<ipython-input-42-ead395105775>", line 2>)
              3 LOAD_CONST               2 ('func_lc.<locals>.<listcomp>')
              6 MAKE_FUNCTION            0
              9 LOAD_GLOBAL              0 (y)
             12 GET_ITER
             13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             16 POP_TOP
             17 LOAD_CONST               0 (None)
             20 RETURN_VALUE

>>> lc_object = list(dis.get_instructions(func_lc))[0].argval
>>> lc_object
<code object <listcomp> at 0x10d3c6780, file "<ipython-input-42-ead395105775>", line 2>
>>> dis.dis(lc_object)
  2           0 BUILD_LIST               0
              3 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                16 (to 25)
              9 STORE_FAST               1 (x)
             12 LOAD_FAST                1 (x)
             15 LOAD_CONST               0 (2)
             18 BINARY_POWER
             19 LIST_APPEND              2
             22 JUMP_ABSOLUTE            6
        >>   25 RETURN_VALUE

On the other hand, the list() version simply passes the generator object to list's __init__ method, which then calls its extend method internally. As the object is not a list or tuple, CPython first gets its iterator and then adds items to the list until the iterator is exhausted:

>>> def func_ge():
    list(x**2 for x in y)
...
>>> dis.dis(func_ge)
  2           0 LOAD_GLOBAL              0 (list)
              3 LOAD_CONST               1 (<code object <genexpr> at 0x10cde6ae0, file "<ipython-input-41-f9a53483f10a>", line 2>)
              6 LOAD_CONST               2 ('func_ge.<locals>.<genexpr>')
              9 MAKE_FUNCTION            0
             12 LOAD_GLOBAL              1 (y)
             15 GET_ITER
             16 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             19 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             22 POP_TOP
             23 LOAD_CONST               0 (None)
             26 RETURN_VALUE
>>> ge_object = list(dis.get_instructions(func_ge))[1].argval
>>> ge_object
<code object <genexpr> at 0x10cde6ae0, file "<ipython-input-41-f9a53483f10a>", line 2>
>>> dis.dis(ge_object)
  2           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                15 (to 21)
              6 STORE_FAST               1 (x)
              9 LOAD_FAST                1 (x)
             12 LOAD_CONST               0 (2)
             15 BINARY_POWER
             16 YIELD_VALUE
             17 POP_TOP
             18 JUMP_ABSOLUTE            3
        >>   21 LOAD_CONST               1 (None)
             24 RETURN_VALUE
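The exact disassembly depends on the CPython version (offsets and opcode names change, and newer releases may compile a comprehension inline rather than as a separate code object), so the listings above may not match what you see locally. A minimal sketch for checking the key difference without relying on a particular layout is to collect the opcode names from a function and everything nested in it. The helper name opnames is made up for this illustration, and the check assumes CPython:

```python
import dis
import types


def opnames(code):
    """Return the set of opcode names used by a code object and by any
    code objects nested in its constants (hypothetical helper)."""
    names = {instr.opname for instr in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names


def func_lc(y):
    return [x**2 for x in y]


def func_ge(y):
    return list(x**2 for x in y)


lc_ops = opnames(func_lc.__code__)
ge_ops = opnames(func_ge.__code__)

# The comprehension appends in bytecode and never yields;
# the generator expression yields each value to its consumer.
print("LIST_APPEND in comprehension:", "LIST_APPEND" in lc_ops)
print("YIELD_VALUE in comprehension:", "YIELD_VALUE" in lc_ops)
print("YIELD_VALUE in list(genexp): ", "YIELD_VALUE" in ge_ops)
```

Walking nested code objects is what keeps the check independent of whether the comprehension body lives in its own code object or in the enclosing function.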

Timing comparisons:

>>> %timeit [x**2 for x in range(10**6)]
1 loops, best of 3: 453 ms per loop
>>> %timeit list(x**2 for x in range(10**6))
1 loops, best of 3: 478 ms per loop
>>> %%timeit
out = []
for x in range(10**6):
    out.append(x**2)
...
1 loops, best of 3: 510 ms per loop

The plain loop is slightly slower because out.append is looked up as an attribute on every iteration. Cache the bound method and time it again:

>>> %%timeit
out = [];append=out.append
for x in range(10**6):
    append(x**2)
...
1 loops, best of 3: 467 ms per loop
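The timings above come from IPython's %timeit magic. Outside IPython, a rough equivalent can be sketched with the standard-library timeit module. The function names, the size N, and the repeat counts below are arbitrary choices for illustration, and the absolute numbers will differ by machine and Python version:

```python
import timeit

N = 10**4  # kept small so the script finishes quickly; raise it for steadier numbers


def with_comprehension():
    return [x**2 for x in range(N)]


def with_list_genexp():
    return list(x**2 for x in range(N))


def with_loop():
    out = []
    for x in range(N):
        out.append(x**2)  # attribute lookup on every iteration
    return out


def with_cached_append():
    out = []
    append = out.append  # look the bound method up once
    for x in range(N):
        append(x**2)
    return out


candidates = [with_comprehension, with_list_genexp, with_loop, with_cached_append]

# Take the best of several repeats, as %timeit does, to reduce noise.
timings = {
    func.__name__: min(timeit.repeat(func, number=20, repeat=3))
    for func in candidates
}

for name, seconds in timings.items():
    print(f"{name:20s} {seconds:.4f} s for 20 calls")
```

All four functions build the same list, so any difference in the printed times reflects only how the list is built.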

Apart from the fact that list comprehensions no longer leak their loop variables, one more difference is that an unparenthesized tuple is no longer valid as the iterable:

>>> [x**2 for x in 1, 2, 3] # Python 2
[1, 4, 9]
>>> [x**2 for x in 1, 2, 3] # Python 3
  File "<ipython-input-69-bea9540dd1d6>", line 1
    [x**2 for x in 1, 2, 3]
                    ^
SyntaxError: invalid syntax

>>> [x**2 for x in (1, 2, 3)] # Add parentheses
[1, 4, 9]
>>> for x in 1, 2, 3: # Python 3: For normal loops it still works
    print(x**2)
...
1
4
9
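Both Python 3 behaviours can also be checked from a script rather than at the prompt. This is a small sketch, assuming Python 3; the variable names are made up, and compile() is used only so that the SyntaxError can be caught at run time instead of stopping the whole file from parsing:

```python
# 1. The loop variable of a comprehension does not leak in Python 3.
x = "outer"
squares = [x**2 for x in (1, 2, 3)]
print(x)        # the module-level x is untouched by the comprehension
print(squares)

# 2. An unparenthesized tuple as the iterable is rejected by the parser.
source = "[x**2 for x in 1, 2, 3]"
try:
    compile(source, "<example>", "eval")
    rejected = False
except SyntaxError:
    rejected = True
print("unparenthesized tuple rejected:", rejected)
```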
Ashwini Chaudhary answered Sep 28 '22 07:09


Both forms create and call an anonymous function. However, the list(...) form creates a generator function and passes the returned generator-iterator to list, while with the [...] form, the anonymous function builds the list directly with LIST_APPEND opcodes.

The following code gets decompilation output of the anonymous functions for an example comprehension and its corresponding genexp-passed-to-list:

import dis

def f():
    [x for x in []]

def g():
    list(x for x in [])

dis.dis(f.__code__.co_consts[1])
dis.dis(g.__code__.co_consts[1])

The output for the comprehension is

  4           0 BUILD_LIST               0
              3 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                12 (to 21)
              9 STORE_FAST               1 (x)
             12 LOAD_FAST                1 (x)
             15 LIST_APPEND              2
             18 JUMP_ABSOLUTE            6
        >>   21 RETURN_VALUE

The output for the genexp is

  7           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                11 (to 17)
              6 STORE_FAST               1 (x)
              9 LOAD_FAST                1 (x)
             12 YIELD_VALUE
             13 POP_TOP
             14 JUMP_ABSOLUTE            3
        >>   17 LOAD_CONST               0 (None)
             20 RETURN_VALUE
user2357112 supports Monica answered Sep 28 '22 05:09