
Possible bug in pdb module in Python 3 when using list generators

After running this code in Python 3:

import pdb

def foo():
    nums = [1, 2, 3]
    a = 5
    pdb.set_trace()

foo()

The following expressions work:

(Pdb) print(nums)
[1, 2, 3]

(Pdb) print(a)
5

(Pdb) [x for x in nums]
[1, 2, 3]

but the following expression fails:

(Pdb) [x*a for x in nums]
*** NameError: global name 'a' is not defined

The above works fine in Python 2.7.

Is this a bug, or am I missing something?

Update: See the new accepted answer. This was indeed a bug (or a problematic design) which has been addressed now by introducing a new command and mode in pdb.

Loax asked Jun 25 '13 06:06


2 Answers

If you type interact in your [i]pdb session, you get an interactive interpreter session, and list comprehensions work as expected in that mode.

source: http://bugs.python.org/msg215963
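
To illustrate why a single merged namespace helps, here is a minimal sketch. It assumes that interact evaluates input against one dict built from the frame's globals and locals; I have not quoted pdb's source here, and evaluate_merged and the two dicts are hypothetical stand-ins for the frame of foo() from the question.

```python
def evaluate_merged(expression, frame_globals, frame_locals):
    """Evaluate expression against ONE dict holding globals and locals."""
    namespace = dict(frame_globals)   # copy, so the real globals stay untouched
    namespace.update(frame_locals)    # locals shadow globals of the same name
    # With a single dict, names used inside the comprehension body are
    # found by the global lookup that the comprehension performs.
    return eval(expression, namespace)


# Hypothetical stand-ins for the globals and locals of foo().
frame_globals = {}
frame_locals = {"nums": [1, 2, 3], "a": 5}

result = evaluate_merged("[x*a for x in nums]", frame_globals, frame_locals)
print(result)
```

The trade-off of this approach is that assignments made in the merged dict are not written back to the real local variables of the frame.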

Ben Usman answered Sep 23 '22 03:09


It works perfectly fine:

>>> import pdb
>>> def f(seq):
...     pdb.set_trace()
... 
>>> f([1,2,3])
--Return--
> <stdin>(2)f()->None
(Pdb) [x for x in seq]
[1, 2, 3]
(Pdb) [x in seq for x in seq]
[True, True, True]

Without showing what you are actually doing nobody can tell you why in your specific case you got a NameError.


TL;DR: In Python 3, list comprehensions are actually functions with their own stack frame, and from that inner frame you cannot access the seq variable, which is an argument of the function being debugged. It is instead treated as a global (and hence not found).


What you see is the difference between the implementations of list comprehensions in Python 2 and Python 3. In Python 2, a list comprehension is just shorthand for a for loop, and you can clearly see this in the bytecode:

>>> import dis
>>> def test(): [x in seq for x in seq]
... 
>>> dis.dis(test)
  1           0 BUILD_LIST               0
              3 LOAD_GLOBAL              0 (seq)
              6 GET_ITER            
        >>    7 FOR_ITER                18 (to 28)
             10 STORE_FAST               0 (x)
             13 LOAD_FAST                0 (x)
             16 LOAD_GLOBAL              0 (seq)
             19 COMPARE_OP               6 (in)
             22 LIST_APPEND              2
             25 JUMP_ABSOLUTE            7
        >>   28 POP_TOP             
             29 LOAD_CONST               0 (None)
             32 RETURN_VALUE        

Note how the bytecode contains a FOR_ITER loop. In Python 3, on the other hand, list comprehensions are actually functions with their own stack frame:

>>> import dis
>>> def test(): [x in seq2 for x in seq]
... 
>>> dis.dis(test)
  1           0 LOAD_CONST               1 (<code object <listcomp> at 0xb6fef160, file "<stdin>", line 1>) 
              3 MAKE_FUNCTION            0 
              6 LOAD_GLOBAL              0 (seq) 
              9 GET_ITER             
             10 CALL_FUNCTION            1 
             13 POP_TOP              
             14 LOAD_CONST               0 (None) 
             17 RETURN_VALUE      

As you can see, there is no FOR_ITER here; instead there are MAKE_FUNCTION and CALL_FUNCTION opcodes. If we examine the code object of the list comprehension, we can see how the bindings are set up:

>>> test.__code__.co_consts[1]
<code object <listcomp> at 0xb6fef160, file "<stdin>", line 1>
>>> test.__code__.co_consts[1].co_argcount   # it has one argument
1
>>> test.__code__.co_consts[1].co_names      # global variables
('seq2',)
>>> test.__code__.co_consts[1].co_varnames   # local variables
('.0', 'x')

Here .0 is the only argument of the function, x is the local variable of the loop, and seq2 is a global variable. Note that .0, the list comprehension's argument, is the iterator obtained from seq, not seq itself (see the GET_ITER opcode in the dis output above). This becomes clearer with a more complex example:

>>> def test():
...     [x in seq for x in zip(seq, a)]
... 
>>> dis.dis(test)
  2           0 LOAD_CONST               1 (<code object <listcomp> at 0xb7196f70, file "<stdin>", line 2>) 
              3 MAKE_FUNCTION            0 
              6 LOAD_GLOBAL              0 (zip) 
              9 LOAD_GLOBAL              1 (seq) 
             12 LOAD_GLOBAL              2 (a) 
             15 CALL_FUNCTION            2 
             18 GET_ITER             
             19 CALL_FUNCTION            1 
             22 POP_TOP              
             23 LOAD_CONST               0 (None) 
             26 RETURN_VALUE 
>>> test.__code__.co_consts[1].co_varnames
('.0', 'x')

Here you can see that the only argument to the list-comprehension, always denoted by .0, is the iterable obtained from zip(seq, a). seq and a themselves are not passed to the list-comprehension. Only iter(zip(seq, a)) is passed inside the list-comprehension.
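
As a rough mental model, the Python 3 comprehension [x in seq2 for x in seq] behaves like the hand-written function below. This is only a sketch of the scoping, not what the compiler literally emits; the names comprehension_equivalent, listcomp and dot0 (standing in for .0) are made up for illustration, and so are the contents of seq2.

```python
seq2 = [2, 3, 4]  # a module-level (global) name, as in the example above


def comprehension_equivalent(seq):
    # The comprehension body becomes its own function. Its only
    # argument (called ".0" in the real code object) is an iterator.
    def listcomp(dot0):
        result = []
        for x in dot0:                # x is local to the inner function
            result.append(x in seq2)  # seq2 is resolved as a global
        return result

    # The outermost iterable is evaluated HERE, in the enclosing scope,
    # and only iter(seq) is handed to the inner function.
    return listcomp(iter(seq))


values = comprehension_equivalent([1, 2, 3])
print(values)
```

In a normally compiled function, a name like seq used inside the body would become a closure variable. Code typed at the (Pdb) prompt is compiled on its own, without knowledge of the enclosing function, so no closure is created and the lookup falls back to globals.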

Another observation we must make is that, when you run pdb, functions you define at the prompt cannot access the local variables of the function being debugged. For example, the following code fails on both Python 2 and Python 3:

>>> import pdb
>>> def test(seq): pdb.set_trace()
... 
>>> test([1,2,3])
--Return--
> <stdin>(1)test()->None
(Pdb) def test2(): print(seq)
(Pdb) test2()
*** NameError: global name 'seq' is not defined

It fails because when defining test2 the seq variable is treated as a global variable, but it's actually a local variable inside the test function, hence it isn't accessible.
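
The same failure can be reproduced without pdb. The sketch below assumes that pdb runs what you type with exec, passing the frame's globals and locals as two separate dicts; the dicts frame_globals and frame_locals are hypothetical stand-ins for a real frame.

```python
# Hypothetical stand-ins for the frame of test([1, 2, 3]).
frame_globals = {}
frame_locals = {"seq": [1, 2, 3]}

source = "def test2():\n    return seq\n"

# Separate dicts: test2 is stored in frame_locals, but its body
# looks seq up in frame_globals, where it does not exist.
exec(source, frame_globals, frame_locals)
try:
    frame_locals["test2"]()
    separate_outcome = "ok"
except NameError:
    separate_outcome = "NameError"

# One merged dict: the global lookup inside test2 now finds seq.
merged = dict(frame_globals)
merged.update(frame_locals)
exec(source, merged)
merged_outcome = merged["test2"]()

print(separate_outcome, merged_outcome)
```

The definition itself succeeds in both cases; the error only appears when the function is called and the name lookup actually happens.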

The behaviour you see is similar to the following scenario:

#python 2 no error
>>> class A(object):
...     x = 1
...     L = [x for _ in range(3)]
... 
>>> 

#python3 error!
>>> class A(object):
...     x = 1
...     L = [x for _ in range(3)]
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in A
  File "<stdin>", line 3, in <listcomp>
NameError: global name 'x' is not defined

The first one doesn't give an error because it is mostly equivalent to:

>>> class A(object):
...     x = 1
...     L = []
...     for _ in range(3): L.append(x)
... 

This works because the list comprehension is "expanded" inline in the bytecode. In Python 3 it fails because the comprehension actually defines a function, and a nested function scope cannot access the class scope:

>>> class A(object):
...     x = 1
...     def test():
...             print(x)
...     test()
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in A
  File "<stdin>", line 4, in test
NameError: global name 'x' is not defined
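
A small runnable sketch of the class-scope rule follows. The class names A and B and the helper build_failing_class are made up for illustration. Note that only names used inside the comprehension body are affected; the outermost iterable is still evaluated in the class body.

```python
class A:
    x = 3
    # The outermost iterable, range(x), is evaluated in the class body
    # itself, so x is visible here even though the loop body is not
    # evaluated in class scope.
    L = [i * 2 for i in range(x)]


def build_failing_class():
    class B:
        x = 1

        def helper():
            return x        # resolved as a global, not in the class body

        value = helper()    # raises NameError while the class is built

    return B


try:
    build_failing_class()
    raised = False
except NameError:
    raised = True

print(A.L, raised)
```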

Note that generator expressions are implemented as functions on Python 2 as well, and in fact you see similar behaviour with them (on both Python 2 and Python 3):

>>> import pdb
>>> def test(seq): pdb.set_trace()
... 
>>> test([1,2,3])
--Return--
> <stdin>(1)test()->None
(Pdb) list(x in seq for x in seq)
*** Error in argument: '(x in seq for x in seq)'

Here pdb doesn't give you more details, but the failure happens for exactly the same reason.


In conclusion: it's not a bug in pdb but the way python implements scopes. AFAIK changing this to allow what you are trying to do in pdb would require some big changes in how functions are treated and I don't know whether this can be done without modifying the interpreter.


Note that when a list comprehension has more than one for clause, the inner loop is expanded in the bytecode like the list comprehensions in Python 2:

>>> import dis
>>> def test(): [x + y for x in seq1 for y in seq2]
... 
>>> dis.dis(test)
  1           0 LOAD_CONST               1 (<code object <listcomp> at 0xb71bf5c0, file "<stdin>", line 1>) 
              3 MAKE_FUNCTION            0 
              6 LOAD_GLOBAL              0 (seq1) 
              9 GET_ITER             
             10 CALL_FUNCTION            1 
             13 POP_TOP              
             14 LOAD_CONST               0 (None) 
             17 RETURN_VALUE         
>>> # The only argument to the listcomp is iter(seq1)
>>> import types
>>> func = types.FunctionType(test.__code__.co_consts[1], globals())
>>> dis.dis(func)
  1           0 BUILD_LIST               0 
              3 LOAD_FAST                0 (.0) 
        >>    6 FOR_ITER                29 (to 38) 
              9 STORE_FAST               1 (x) 
             12 LOAD_GLOBAL              0 (seq2) 
             15 GET_ITER             
        >>   16 FOR_ITER                16 (to 35) 
             19 STORE_FAST               2 (y) 
             22 LOAD_FAST                1 (x) 
             25 LOAD_FAST                2 (y) 
             28 BINARY_ADD           
             29 LIST_APPEND              3 
             32 JUMP_ABSOLUTE           16 
        >>   35 JUMP_ABSOLUTE            6 
        >>   38 RETURN_VALUE        

As you can see, the bytecode for listcomp has an explicit FOR_ITER over seq2. This explicit FOR_ITER is inside the listcomp function, and thus the restrictions on scopes still apply (e.g. seq2 is loaded as a global).

And in fact we can confirm this using pdb:

>>> import pdb
>>> def test(seq1, seq2): pdb.set_trace()
... 
>>> test([1,2,3], [4,5,6])
--Return--
> <stdin>(1)test()->None
(Pdb) [x + y for x in seq1 for y in seq2]
*** NameError: global name 'seq2' is not defined
(Pdb) [x + y for x in non_existent for y in seq2]
*** NameError: name 'non_existent' is not defined

Note how the NameError is about seq2 and not seq1 (which is passed as the function argument), and note how changing the first iterable's name to something that doesn't exist changes the NameError (which means that in the first case seq1 was passed successfully).

Bakuriu answered Sep 23 '22 03:09