Python variables lose scope inside generator?

The code below raises NameError: global name 'self' is not defined. Why?

lengths = [3, 10]
self.fooDict = getOrderedDict(stuff)

if not all(0 < l < len(self.fooDict) for l in lengths):
    raise ValueError("Bad lengths!")

Note that self.fooDict is an OrderedDict (imported from the collections library) that has 35 entries. When I try to debug, the code below executes without error:

(Pdb) len(self.dataDict)
35
(Pdb) all(0 < size < 35 for size in lengths)
True

But the debugging code below gives me the original error:

(Pdb) baz = len(self.dataDict)
(Pdb) all(0 < size < baz for size in lengths)
NameError: global name 'baz' is not defined
asked Jul 08 '15 16:07 by BoltzmannBrain


1 Answer

Short answer and workaround

You've run into a limitation of the debugger. A generator expression entered at the debugger prompt cannot use the local variables of the frame you are debugging, because the debugger cannot create the closures needed to reach them.

You could instead create a function to run your generator, thus creating a new scope at the same time:

def _test(baz, lengths):
    return all(0 < size < baz for size in lengths)

_test(len(self.dataDict), lengths)

Note that this applies to set and dictionary comprehensions as well, and in Python 3, list comprehensions.

The long answer, why this happens

Generator expressions (and set, dict and Python 3 list comprehensions) run in a new, nested namespace. The name baz in your generator expression is not a local in that namespace, so Python has to find it somewhere else. At compile time Python determines where to source that name from. It searches the enclosing scopes available to the compiler and, if none of them binds the name, treats it as a global.

Here are two generator expressions to illustrate:

def function(some_iterable):
    gen1 = (var == spam for var in some_iterable)

    ham = 'bar'
    gen2 = (var == ham for var in some_iterable)

    return gen1, gen2

The name spam is not found in the parent scope, so the compiler marks it as a global:

>>> dis.dis(function.__code__.co_consts[1])  # gen1
  2           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                17 (to 23)
              6 STORE_FAST               1 (var)
              9 LOAD_FAST                1 (var)
             12 LOAD_GLOBAL              0 (spam)
             15 COMPARE_OP               2 (==)
             18 YIELD_VALUE         
             19 POP_TOP             
             20 JUMP_ABSOLUTE            3
        >>   23 LOAD_CONST               0 (None)
             26 RETURN_VALUE        

The opcode at index 12 uses LOAD_GLOBAL to load the spam name.

The name ham is found in the function scope, so the compiler generates bytecode to look up the name as a closure from the function. At the same time the name ham is marked as a closure variable in function itself; the code generated for function stores it in a cell, so that you can still reference it when the function has returned.

>>> dis.dis(function.__code__.co_consts[3])  # gen2
  4           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                17 (to 23)
              6 STORE_FAST               1 (var)
              9 LOAD_FAST                1 (var)
             12 LOAD_DEREF               0 (ham)
             15 COMPARE_OP               2 (==)
             18 YIELD_VALUE         
             19 POP_TOP             
             20 JUMP_ABSOLUTE            3
        >>   23 LOAD_CONST               0 (None)
             26 RETURN_VALUE        
>>> function.__code__.co_cellvars  # closure cells
('ham',)

The name ham is loaded with a LOAD_DEREF opcode, and the function code object has listed that name as a closure. When you disassemble function you'll find, among other bytecode:

>>> dis.dis(function)
  # ....

  4          22 LOAD_CLOSURE             0 (ham)
             25 BUILD_TUPLE              1
             28 LOAD_CONST               3 (<code object <genexpr> at 0x1074a87b0, file "<stdin>", line 4>)
             31 MAKE_CLOSURE             0
             34 LOAD_FAST                0 (some_iterable)
             37 GET_ITER            
             38 CALL_FUNCTION            1
             41 STORE_FAST               2 (gen2)

  # ...

where the LOAD_CLOSURE and MAKE_CLOSURE bytecodes create a closure for ham to be used by the generator code object.
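The disassembly above comes from a Python 2 interpreter, and the constant indices and opcode names differ between versions. As a rough, version-independent sketch, the same classification can be read from the code objects' attributes instead. The variable names genexprs, gen1_code, gen2_code and the three boolean flags are illustrative and not part of the original answer:

```python
import types


def function(some_iterable):
    gen1 = (var == spam for var in some_iterable)  # spam is bound in no enclosing function

    ham = 'bar'
    gen2 = (var == ham for var in some_iterable)   # ham is a local of the enclosing function

    return gen1, gen2


# Collect the nested <genexpr> code objects without relying on constant indices,
# then order them by the source line on which they start.
genexprs = sorted(
    (c for c in function.__code__.co_consts if isinstance(c, types.CodeType)),
    key=lambda c: c.co_firstlineno,
)
gen1_code, gen2_code = genexprs

# spam is looked up as a global name; ham is a free variable supplied by a closure.
spam_is_global = 'spam' in gen1_code.co_names and 'spam' not in gen1_code.co_freevars
ham_is_free = gen2_code.co_freevars == ('ham',)
ham_is_cell = function.__code__.co_cellvars == ('ham',)

# The closure keeps ham reachable after function() has returned.
# gen1 is created but never iterated, so the missing global spam is never looked up.
_, gen2 = function(['foo', 'bar'])
values = list(gen2)
```

Here values should come out as [False, True], since only the second item equals ham.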

When you run arbitrary expressions in the debugger, the compiler has no access to the namespace you are debugging. More importantly, it cannot alter that namespace to create a closure. Thus, you cannot use anything but globals in your generator expressions.
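The effect can be approximated outside the debugger with eval() and separate globals and locals dictionaries. This is only a sketch of the mechanism, not how pdb is implemented; frame_globals and frame_locals are hypothetical stand-ins for the namespaces of the frame being debugged:

```python
lengths = [3, 10]
expr = "all(0 < size < baz for size in lengths)"

# Hypothetical stand-ins for the debugged frame's namespaces.
frame_globals = {}
frame_locals = {"lengths": lengths, "baz": 35}

# With separate namespaces, the outermost iterable (lengths) is resolved in the
# locals, but baz inside the generator expression is compiled as a global lookup.
try:
    eval(expr, frame_globals, frame_locals)
    error = None
except NameError as exc:
    error = exc

# Merging the locals into a single namespace makes baz visible as a global.
merged = dict(frame_globals)
merged.update(frame_locals)
result = eval(expr, merged)
```

The first eval() call fails with a NameError for baz, while the second succeeds because every name the generator expression needs is now a global of the evaluated code. Copying locals into a single namespace is one possible alternative to the helper function shown in the short answer, at the cost of working on a copy rather than on the live frame.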

answered Sep 21 '22 03:09 by Martijn Pieters