
listcomp unable to access locals defined in code called by exec if nested in function

Are there any Python gurus out there able to explain why this code doesn't work:

def f(code_str):
    exec(code_str)

code = """
g = 5
x = [g for i in range(5)]
"""

f(code)

Error:

Traceback (most recent call last):
  File "py_exec_test.py", line 9, in <module>
    f(code)
  File "py_exec_test.py", line 2, in f
    exec(code_str)
  File "<string>", line 3, in <module>
  File "<string>", line 3, in <listcomp>
NameError: name 'g' is not defined

while this one works fine:

code = """
g = 5
x = [g for i in range(5)]
"""

exec(code)

I know it has something to do with locals and globals: if I pass the exec function the locals and globals from my main scope, it works fine. But I don't understand exactly what is going on.

Could it be a bug with Cython?

EDIT: Tried this with Python 3.4.0 and Python 3.4.3

levesque asked Oct 01 '15 19:10


2 Answers

The problem is that the list comprehension is compiled without a closure when it runs inside exec().

When you make a function (in this case a list comprehension) outside of an exec(), the compiler builds a tuple with the free variables (the variables used by a code block but not defined by it, i.e. g in your case). This tuple is called the function's closure. It is kept in the __closure__ member of the function.
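As a minimal sketch of what a closure looks like from Python code, the snippet below uses an ordinary nested function instead of a list comprehension. The names outer, inner, free_names and captured are made up for this illustration.

```python
def outer():
    g = 5

    def inner():
        # g is used here but not defined here, so it is a free variable.
        return g

    return inner


fn = outer()

# Names of the free variables, as recorded by the compiler.
free_names = fn.__code__.co_freevars
# Values held by the closure cells, in the same order.
captured = [cell.cell_contents for cell in fn.__closure__]

print(free_names, captured)
```

The cells in __closure__ line up one-to-one with the names in co_freevars.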

Inside the exec(), the compiler won't build a closure for the list comprehension and instead looks the name up in the globals() dictionary by default. This is why adding global g at the beginning of the code will work (as will globals().update(locals())).
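A minimal sketch of the global g workaround follows. It passes two explicit dictionaries (global_ns and local_ns, hypothetical names) instead of relying on the caller's globals() and locals(), only so that the result can be read back afterwards; that substitution is an assumption of this example, not part of the original code.

```python
code = """
global g
g = 5
x = [g for i in range(5)]
"""


def f(code_str):
    # Two separate mappings, as in the failing case.
    global_ns, local_ns = {}, {}
    exec(code_str, global_ns, local_ns)
    # Because of "global g", g was stored in global_ns;
    # x was not declared global, so it was stored in local_ns.
    return global_ns["g"], local_ns["x"]


result = f(code)
print(result)
```

The side effect is that g now lives in the global namespace, which may or may not be acceptable in real code.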

Using exec() in its two-parameter form will also solve the problem: when only a globals dictionary is given, Python uses that same dictionary for both globals and locals (as per the documentation). An assignment is therefore visible in the globals and in the locals at the same time. Since Python looks the name up in the globals, this approach works.
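The two-parameter form could look like the sketch below. Using a fresh dictionary (called namespace here, a hypothetical name) rather than globals() is an assumption made to keep the example isolated.

```python
code = """
g = 5
x = [g for i in range(5)]
"""


def f(code_str):
    # One dict serves as both globals and locals for the executed code.
    namespace = {}
    exec(code_str, namespace)
    return namespace["x"]


result = f(code)
print(result)  # [5, 5, 5, 5, 5]
```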

Here's another view on the problem:

import dis

code = """
g = 5
x = [g for i in range(5)]
"""

a = compile(code, '<boum>', 'exec')
dis.dis(a)
print("###")
dis.dis(a.co_consts[1])

This code produces this bytecode:

  2           0 LOAD_CONST               0 (5)
              3 STORE_NAME               0 (g)

  3           6 LOAD_CONST               1 (<code object <listcomp> at 0x7fb1b22ceb70, file "<boum>", line 3>)
              9 LOAD_CONST               2 ('<listcomp>')
             12 MAKE_FUNCTION            0
             15 LOAD_NAME                1 (range)
             18 LOAD_CONST               0 (5)
             21 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             24 GET_ITER
             25 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             28 STORE_NAME               2 (x)
             31 LOAD_CONST               3 (None)
             34 RETURN_VALUE
###
  3           0 BUILD_LIST               0
              3 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                12 (to 21)
              9 STORE_FAST               1 (i)
             12 LOAD_GLOBAL              0 (g)      <---- THIS LINE
             15 LIST_APPEND              2
             18 JUMP_ABSOLUTE            6
        >>   21 RETURN_VALUE

Notice how the list comprehension performs a LOAD_GLOBAL to load g (the marked line in the second listing).
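To check this without reading a full disassembly, a small helper can collect only the instructions that refer to g. This is a sketch: ops_touching is a hypothetical helper, and the exact opcode names it prints depend on the CPython version (newer releases may compile comprehensions differently from 3.4).

```python
import dis
import types

code = """
g = 5
x = [g for i in range(5)]
"""


def ops_touching(code_obj, name):
    """Return the opnames of all instructions whose argument is `name`,
    recursing into nested code objects such as a <listcomp>."""
    found = [ins.opname for ins in dis.get_instructions(code_obj)
             if ins.argval == name]
    for const in code_obj.co_consts:
        if isinstance(const, types.CodeType):
            found.extend(ops_touching(const, name))
    return found


compiled = compile(code, "<boum>", "exec")
ops = ops_touching(compiled, "g")
print(ops)
```

Whatever the version, code compiled from a string this way should not use LOAD_DEREF for g, since there is no enclosing function for the compiler to see.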

Now, if you have this code instead:

def Foo():
    a = compile(code, '<boum>', 'exec')
    dis.dis(a)
    print("###")
    dis.dis(a.co_consts[1])
    exec(code)

Foo()

This produces exactly the same bytecode, which is problematic: since we're in a function, g is not stored in the global variables but in the locals of the function. Yet Python searches for it in the global variables (with LOAD_GLOBAL)!

This is what the interpreter does outside of exec():

def Bar():
    g = 5
    x = [g for i in range(5)]

dis.dis(Bar)
print("###")
dis.dis(Bar.__code__.co_consts[2])

This code gives us this bytecode:

 30           0 LOAD_CONST               1 (5)
              3 STORE_DEREF              0 (g)

 31           6 LOAD_CLOSURE             0 (g)
              9 BUILD_TUPLE              1
             12 LOAD_CONST               2 (<code object <listcomp> at 0x7fb1b22ae030, file "test.py", line 31>)
             15 LOAD_CONST               3 ('Bar.<locals>.<listcomp>')
             18 MAKE_CLOSURE             0
             21 LOAD_GLOBAL              0 (range)
             24 LOAD_CONST               1 (5)
             27 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             30 GET_ITER
             31 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             34 STORE_FAST               0 (x)
             37 LOAD_CONST               0 (None)
             40 RETURN_VALUE
###
 31           0 BUILD_LIST               0
              3 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                12 (to 21)
              9 STORE_FAST               1 (i)
             12 LOAD_DEREF               0 (g)      <---- THIS LINE
             15 LIST_APPEND              2
             18 JUMP_ABSOLUTE            6
        >>   21 RETURN_VALUE

As you can see, g is loaded using LOAD_DEREF, which reads it from the tuple built by BUILD_TUPLE; LOAD_CLOSURE put the variable g into that tuple. The MAKE_CLOSURE instruction creates a function, just like the MAKE_FUNCTION seen earlier, but with a closure.

Here's my guess at why it works this way: closures are created as needed when the module is first compiled. When exec() is executed, it cannot tell that the functions defined within the code it executes need a closure. As far as exec() is concerned, code in its string that doesn't begin with an indentation is in the global scope. The only way to know whether it was invoked in a way that requires a closure would be for exec() to inspect the current scope (which seems pretty hackish to me).

This is indeed obscure behavior; it can be explained, but it certainly raises some eyebrows when it happens. It is a side effect well explained in the Python guide, though it is hard to understand why it applies to this particular case.

All my analysis was done on Python 3; I have not tried anything on Python 2.

Soravux answered Nov 19 '22 19:11


EDIT 2

As other commenters have noticed, you appear to have found a bug in Python 3 (doesn't happen for me in 2.7).

As discussed in the comments below this answer, the original code:

def f(code_str):
    exec(code_str)

is functionally equivalent to:

def f(code_str):
    exec(code_str, globals(), locals())

On my machine, running 3.4, it is functionally equivalent to the extent that it blows up in just the same way. The bug here has to do with running the list comprehension while having two mapping objects. For example:

def f(code_str):
    exec(code_str, globals(), {})

will also fail with the same exception.
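The difference between one and two mapping objects can be sketched as follows. An explicit nested def stands in for the list comprehension here, on the assumption that it follows the same name lookup as the comprehension does in the Python versions discussed; run and show are hypothetical names.

```python
code = """
g = 5
def show():
    return g
x = show()
"""


def run(code_str, *namespaces):
    """Execute code_str with the given mapping objects and report the outcome."""
    try:
        exec(code_str, *namespaces)
    except NameError as exc:
        return "NameError: {}".format(exc)
    return "ok"


two = run(code, {}, {})  # separate globals and locals: g lands in the locals
one = run(code, {})      # a single dict used for both

print(two)
print(one)
```

With two dicts, g is stored in the locals mapping while show() looks it up in the globals mapping, so the lookup fails.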

To avoid provoking this bug, you have to pass exactly one mapping object (because not passing any is equivalent to passing two), and to ensure that it works in all cases, you should never pass a function's locals() as that mapping object.

The rest of this answer was written before I realized behavior was different under 3. I'm leaving it, because it's still good advice and gives some insights into exec behavior.

You should never directly alter a function's locals() dictionary. That messes with optimized lookups. See, e.g., this question and its answers.

In particular, as the Python doc explains:

The contents of this dictionary should not be modified; changes may not affect the values of local and free variables used by the interpreter.

Because you called exec() from within a function and didn't explicitly pass in locals(), you modified the function's locals, and as the doc explains, that doesn't always work.
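A short sketch of what "doesn't always work" can mean in practice on Python 3: neither exec() nor a direct write to locals() rebinds an existing local variable of the function. The function names are made up for this example.

```python
def via_exec():
    value = 1
    exec("value = 2")  # the assignment goes to a dict, not to the local variable
    return value


def via_locals():
    value = 1
    locals()["value"] = 2  # same story when writing to the dict directly
    return value


results = (via_exec(), via_locals())
print(results)  # (1, 1)
```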

So the Pythonic way, as others have pointed out, is to explicitly pass mapping objects to exec().

Python 2.7

When is it OK to modify locals()? One answer is when you are building a class -- at that point it is merely another dictionary:

code = """
g = 5
x = [g for i in range(5)]
"""

class Foo(object):
    exec(code)

print Foo.x, Foo.g

[5, 5, 5, 5, 5] 5

EDIT -- Python 3

As others point out, there appears to be a bug with the locals() here, independent of whether you are inside a function. You can work around this by only passing a single parameter for the globals. The Python documentation explains that if you only pass a single dict, that will be used for both global and local accesses (it's really the same thing as if your code is not executing in a function or class definition -- there is no locals()). So the bug related to locals() does not appear in this case.

The class example above would be:

code = """
g = 5
x = [g for i in range(5)]
"""

class Foo(object):
    exec(code, vars())

print(Foo.x, Foo.g)

Patrick Maupin answered Nov 19 '22 19:11