I have a list and a lambda function defined as
In [1]: i = lambda x: a[x]
In [2]: alist = [(1, 2), (3, 4)]
Then I try two different methods to calculate a simple sum.
First method.
In [3]: [i(0) + i(1) for a in alist]
Out[3]: [3, 7]
Second method.
In [4]: list(i(0) + i(1) for a in alist)
Out[4]: [7, 7]
Both results are unexpectedly different. Why is that happening?
This behaviour has been fixed in Python 3. When you use a list comprehension [i(0) + i(1) for a in alist], you define a in its surrounding scope, which is accessible to i. In a new session, list(i(0) + i(1) for a in alist) will throw an error.
>>> i = lambda x: a[x]
>>> alist = [(1, 2), (3, 4)]
>>> list(i(0) + i(1) for a in alist)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in <genexpr>
File "<stdin>", line 1, in <lambda>
NameError: global name 'a' is not defined
A list comprehension is not a generator: Generator expressions and list comprehensions.
Generator expressions are surrounded by parentheses (“()”) and list comprehensions are surrounded by square brackets (“[]”).
In your example, the generator expression passed to list() has its own scope for its loop variable, and anything called from it has access to global variables at most. When you use that, i will look for a in the global scope. Try this in a new session:
>>> i = lambda x: a[x]
>>> alist = [(1, 2), (3, 4)]
>>> [i(0) + i(1) for a in alist]
[3, 7]
>>> a
(3, 4)
Compare it to this in another session:
>>> i = lambda x: a[x]
>>> alist = [(1, 2), (3, 4)]
>>> l = (i(0) + i(1) for a in alist)
>>> l
<generator object <genexpr> at 0x10e60db90>
>>> a
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
>>> [x for x in l]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in <genexpr>
File "<stdin>", line 1, in <lambda>
NameError: global name 'a' is not defined
When you run list(i(0) + i(1) for a in alist), you pass the generator (i(0) + i(1) for a in alist) to the list class, which consumes it to build the list before returning it. The loop variable a exists only inside the generator's own scope, so for the lambda function, which looks a up in the global scope, the variable a has no meaning.
The generator object <generator object <genexpr> at 0x10e60db90> does not expose the variable name a. So when list iterates the generator, the lambda function throws an error for the undefined a.
The behaviour of list comprehensions in contrast with generators is also mentioned here:
List comprehensions also "leak" their loop variable into the surrounding scope. This will also change in Python 3.0, so that the semantic definition of a list comprehension in Python 3.0 will be equivalent to list(<generator expression>). Python 2.4 and beyond should issue a deprecation warning if a list comprehension's loop variable has the same name as a variable used in the immediately surrounding scope.
In Python 3:
>>> i = lambda x: a[x]
>>> alist = [(1, 2), (3, 4)]
>>> [i(0) + i(1) for a in alist]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in <listcomp>
File "<stdin>", line 1, in <lambda>
NameError: name 'a' is not defined
The important things to understand here are:
Generator expressions create function objects internally, but list comprehensions do not.
Both bind the loop variable to the values from the iterable, and the loop variable is created in the current scope if it does not already exist.
Let's look at the bytecode of the generator expression:
>>> from dis import dis
>>> dis(compile('(i(0) + i(1) for a in alist)', 'string', 'exec'))
  1           0 LOAD_CONST               0 (<code object <genexpr> at ...>)
              3 MAKE_FUNCTION            0
              6 LOAD_NAME                0 (alist)
              9 GET_ITER
             10 CALL_FUNCTION            1
             13 POP_TOP
             14 LOAD_CONST               1 (None)
             17 RETURN_VALUE
It loads the code object and then makes it a function. Let's look at the actual code object.
>>> dis(compile('(i(0) + i(1) for a in alist)', 'string', 'exec').co_consts[0])
  1           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                27 (to 33)
              6 STORE_FAST               1 (a)
              9 LOAD_GLOBAL              0 (i)
             12 LOAD_CONST               0 (0)
             15 CALL_FUNCTION            1
             18 LOAD_GLOBAL              0 (i)
             21 LOAD_CONST               1 (1)
             24 CALL_FUNCTION            1
             27 BINARY_ADD
             28 YIELD_VALUE
             29 POP_TOP
             30 JUMP_ABSOLUTE            3
        >>   33 LOAD_CONST               2 (None)
             36 RETURN_VALUE
As you see here, the current value from the iterator is stored in the variable a. But since we make this a function object, the a created will be visible only within the generator expression.
But in the case of a list comprehension,
>>> dis(compile('[i(0) + i(1) for a in alist]', 'string', 'exec'))
  1           0 BUILD_LIST               0
              3 LOAD_NAME                0 (alist)
              6 GET_ITER
        >>    7 FOR_ITER                28 (to 38)
             10 STORE_NAME               1 (a)
             13 LOAD_NAME                2 (i)
             16 LOAD_CONST               0 (0)
             19 CALL_FUNCTION            1
             22 LOAD_NAME                2 (i)
             25 LOAD_CONST               1 (1)
             28 CALL_FUNCTION            1
             31 BINARY_ADD
             32 LIST_APPEND              2
             35 JUMP_ABSOLUTE            7
        >>   38 POP_TOP
             39 LOAD_CONST               2 (None)
             42 RETURN_VALUE
There is no explicit function creation and the variable a is created in the current scope. So, a is leaked into the current scope.
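The generator-expression half of this can also be checked without reading a full disassembly, by inspecting the compiled code objects directly. The following is only a sketch, assuming Python 3 (the list comprehension half differs between Python versions, so it is left out); the names outer, nested and inner are made up for illustration.

```python
import types

# Compile the generator expression from the question without running it.
outer = compile("(i(0) + i(1) for a in alist)", "<string>", "eval")

# The generator body is stored as a nested code object among the constants.
nested = [c for c in outer.co_consts if isinstance(c, types.CodeType)]
inner = nested[0]

print(inner.co_name)      # name of the nested code object
print(inner.co_varnames)  # local variables of the generator body
print(outer.co_names)     # names the surrounding code refers to
```

The loop variable a shows up among the locals of the nested code object and not among the names used by the surrounding code, which matches the STORE_FAST seen in the disassembly.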
With this understanding, let's approach your problem.
>>> i = lambda x: a[x]
>>> alist = [(1, 2), (3, 4)]
Now, when you create a list with comprehension,
>>> [i(0) + i(1) for a in alist]
[3, 7]
>>> a
(3, 4)
you can see that a is leaked to the current scope and it is still bound to the last value from the iteration.
So, when you iterate the generator expression after the list comprehension, the lambda function uses the leaked a. That is why you are getting [7, 7], since a is still bound to (3, 4).
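The leak itself only happens in Python 2, but its effect can be reproduced in Python 3 by binding a by hand. This is a minimal sketch, assuming an explicitly assigned a stands in for the leaked loop variable:

```python
i = lambda x: a[x]
alist = [(1, 2), (3, 4)]

# Stand-in for the value a Python 2 list comprehension would have leaked.
a = (3, 4)

# The generator's own loop variable a is local to the generator, so the
# lambda ignores it and reads the outer a on every call.
result = list(i(0) + i(1) for a in alist)
print(result)  # [7, 7]
```

Each element is computed from the outer a, never from the tuple the generator is currently looping over.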
But if you iterate the generator expression first, then a will be bound to the values from alist and will not be leaked to the current scope, as the generator expression becomes a function. So, when the lambda function tries to access a, it cannot find it anywhere. That is why it fails with the error.
Note: The same behaviour cannot be observed in Python 3.x, because the leaking is prevented by creating functions for list comprehensions as well. You might want to read more about this in the History of Python blog's post, From List Comprehensions to Generator Expressions, written by Guido himself.
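If you want to confirm the Python 3 behaviour yourself, a small check along these lines should do. It is only a sketch: the loop variable is named pair (a hypothetical name chosen so that it does not clash with anything else in your session), and it assumes pair is not already defined.

```python
alist = [(1, 2), (3, 4)]
sums = [pair[0] + pair[1] for pair in alist]

try:
    pair  # look up the loop variable outside the comprehension
    leaked = True
except NameError:
    leaked = False

print(sums)    # [3, 7]
print(leaked)  # False in Python 3
```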
See my other answer for a workaround. But thinking a bit more about it, the problem seems to be a bit more complex. I think there are several issues going on here:
When you do i = lambda x: a[x], the variable a is not a parameter of the function; it is a free variable that the function looks up in an outer scope, which is often described as a closure. This works the same way for lambda expressions and normal function definitions.
Python apparently does 'late binding', which means that the values of the variables you closed over are only looked up at the moment you call the function. This can lead to various unexpected results.
In Python 2, there is a difference between list comprehensions, which leak their loop variable, and generator expressions, in which the loop variable does not leak (see this PEP for details). This difference has been removed in Python 3, where a list comprehension is a shortcut for list(generator_expression). I am not sure, but this probably means that Python 2 list comprehensions execute in their outer scope, while generator expressions and Python 3 list comprehensions create their own inner scope.
Demonstration (in Python 2):
In [1]: def f():  # closes over a from global scope
   ...:     return 2 * a
   ...:
In [2]: list(f() for a in range(5))  # does not find a in global scope
[...]
NameError: global name 'a' is not defined
In [3]: [f() for a in range(5)]  # executes in global scope, so f finds a. Also leaks a=4
Out[3]: [0, 2, 4, 6, 8]
In [4]: list(f() for a in range(5))  # finds a=4 in global scope
Out[4]: [8, 8, 8, 8, 8]
In Python 3:
In [1]: def f():
   ...:     return 2 * a
   ...:
In [2]: list(f() for a in range(5))  # does not find a in global scope, does not leak a
[...]
NameError: name 'a' is not defined
In [3]: [f() for a in range(5)]  # does not find a in global scope, does not leak a
[...]
NameError: name 'a' is not defined
In [4]: list(f() for a in range(5))  # a still undefined
[...]
NameError: name 'a' is not defined
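The late-binding point is easy to see in isolation, without any comprehension involved. Here is a minimal sketch that reuses the f from the demonstration; the names first and second are just for illustration.

```python
def f():
    return 2 * a  # a is looked up when f is called, not when f is defined

a = 1
first = f()

a = 10
second = f()

print(first, second)  # 2 20
```

f never stores a value for a; it reads whatever a is bound to at the moment of each call.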
a is looked up in the global scope, so the call should give an error when a is not defined there.
The solution is:
i = lambda a, x: a[x]
After [i(0) + i(1) for a in alist] is executed, a becomes (3, 4).
Then, when the line below is executed:
list(i(0) + i(1) for a in alist)
the value (3, 4) is used both times by the lambda function i as the value of a, so it prints [7, 7].
Instead, you should define your lambda function with two parameters, a and x.
i = lambda a, x: a[x]
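With the two-parameter version, the call sites have to pass a explicitly. A minimal sketch of how both forms from the question would then look (the result names from_listcomp and from_genexpr are made up for illustration):

```python
i = lambda a, x: a[x]
alist = [(1, 2), (3, 4)]

# a is now an argument, so no outer variable is consulted.
from_listcomp = [i(a, 0) + i(a, 1) for a in alist]
from_genexpr = list(i(a, 0) + i(a, 1) for a in alist)

print(from_listcomp)  # [3, 7]
print(from_genexpr)   # [3, 7]
```

Because nothing depends on a leaked or global a any more, both forms give the same result regardless of the order in which they run.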