I'm using symtable
to get the symbol tables of a piece of code. Curiously, when using a comprehension (listcomp, setcomp, etc.), there are some extra symbols I didn't define.
Reproduction (using CPython 3.6):
import symtable
root = symtable.symtable('[x for x in y]', '?', 'exec')
# Print symtable of the listcomp
print(root.get_children()[0].get_symbols())
Output:
[<symbol '.0'>, <symbol '_[1]'>, <symbol 'x'>]
Symbol x
is expected. But what are .0
and _[1]
?
Note that with any other non-comprehension construct I'm getting exactly the identifiers I used in the code. E.g., lambda x: y
only results in the symbols [<symbol 'x'>, <symbol 'y'>]
.
Also, the docs say that symtable.Symbol
is...
An entry in a
SymbolTable
corresponding to an identifier in the source.
...although these identifiers evidently don't appear in the source.
The two names are used to implement list comprehensions as a separate scope, and they have the following meaning:
.0
is an implicit argument, used for the iterable (sourced from y
in your case)._[1]
is a temporary name in the symbol table, used for the target list. This list eventually ends up on the stack.*
A list comprehension (as well as dict and set comprehension and generator expressions) is executed in a new scope. To achieve this, Python effectively creates a new anonymous function.
Because it is a function, really, you need to pass in the iterable you are looping over as an argument. This is what .0
is for, it is the first implicit argument (so at index 0
). The symbol table you produced explicitly lists .0
as an argument:
>>> root = symtable.symtable('[x for x in y]', '?', 'exec')
>>> type(root.get_children()[0])
<class 'symtable.Function'>
>>> root.get_children()[0].get_parameters()
('.0',)
The first child of your table is a function with one argument named .0
.
A list comprehension also needs to build the output list, and that list could be seen as a local too. This is the _[1]
temporary variable. It never actually becomes a named local variable in the code object that is produced; this temporary variable is kept on the stack instead.
You can see the code object produced when you use compile()
:
>>> code_object = compile('[x for x in y]', '?', 'exec')
>>> code_object
<code object <module> at 0x11a4f3ed0, file "?", line 1>
>>> code_object.co_consts[0]
<code object <listcomp> at 0x11a4ea8a0, file "?", line 1>
So there is an outer code object, and in the constants, is another, nested code object. That latter one is the actual code object for the loop. It uses .0
and x
as local variables. It also takes 1 argument; the names for arguments are the first co_argcount
values in the co_varnames
tuple:
>>> code_object.co_consts[0].co_varnames
('.0', 'x')
>>> code_object.co_consts[0].co_argcount
1
So .0
is the argument name here.
The _[1]
temporary variable is handled on the stack, see the disassembly:
>>> import dis
>>> dis.dis(code_object.co_consts[0])
1 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 8 (to 14)
6 STORE_FAST 1 (x)
8 LOAD_FAST 1 (x)
10 LIST_APPEND 2
12 JUMP_ABSOLUTE 4
>> 14 RETURN_VALUE
Here we see .0
referenced again. _[1]
is the BUILD_LIST
opcode pushing a list object onto the stack, then .0
is put on the stack for the FOR_ITER
opcode to iterate over (the opcode removes the iterable from .0
from the stack again).
Each iteration result is pushed onto the stack by FOR_ITER
, popped again and stored in x
with STORE_FAST
, then loaded onto the stack again with LOAD_FAST
. Finally LIST_APPEND
takes the top element from the stack, and adds it to the list referenced by the next element on the stack, so to _[1]
.
JUMP_ABSOLUTE
then brings us back to the top of the loop, where we continue to iterate until the iterable is done. Finally, RETURN_VALUE
returns the top of the stack, again _[1]
, to the caller.
The outer code object does the work of loading the nested code object and calling it as a function:
>>> dis.dis(code_object)
1 0 LOAD_CONST 0 (<code object <listcomp> at 0x11a4ea8a0, file "?", line 1>)
2 LOAD_CONST 1 ('<listcomp>')
4 MAKE_FUNCTION 0
6 LOAD_NAME 0 (y)
8 GET_ITER
10 CALL_FUNCTION 1
12 POP_TOP
14 LOAD_CONST 2 (None)
16 RETURN_VALUE
So this makes a function object, with the function named <listcomp>
(helpful for tracebacks), loads y
, produces an iterator for it (the moral equivalent of iter(y)
, and calls the function with that iterator as the argument.
If you wanted to translate that to Psuedo-code, it would look like:
def <listcomp>(.0):
_[1] = []
for x in .0:
_[1].append(x)
return _[1]
<listcomp>(iter(y))
The _[1]
temporary variable is of course not needed for generator expressions:
>>> symtable.symtable('(x for x in y)', '?', 'exec').get_children()[0].get_symbols()
[<symbol '.0'>, <symbol 'x'>]
Instead of appending to a list, the generator expression function object yields the values:
>>> dis.dis(compile('(x for x in y)', '?', 'exec').co_consts[0])
1 0 LOAD_FAST 0 (.0)
>> 2 FOR_ITER 10 (to 14)
4 STORE_FAST 1 (x)
6 LOAD_FAST 1 (x)
8 YIELD_VALUE
10 POP_TOP
12 JUMP_ABSOLUTE 2
>> 14 LOAD_CONST 0 (None)
16 RETURN_VALUE
Together with the outer bytecode, the generator expression is equivalent to:
def <genexpr>(.0):
for x in .0:
yield x
<genexpr>(iter(y))
* The temporary variable is actually no longer needed; they were used in the initial implementation of comprehensions, but this commit from April 2007 moved the compiler to just using the stack, and this has been the norm for all of the 3.x releases, as well as Python 2.7. It still is easier to think of the generated name as a reference to the stack. Because the variable is no longer needed, I filed issue 32836 to have it removed, and Python 3.8 and onwards will no longer include it in the symbol table.
In Python 2.6, you can still see the actual temporary name in the disassembly:
>>> import dis
>>> dis.dis(compile('[x for x in y]', '?', 'exec'))
1 0 BUILD_LIST 0
3 DUP_TOP
4 STORE_NAME 0 (_[1])
7 LOAD_NAME 1 (y)
10 GET_ITER
>> 11 FOR_ITER 13 (to 27)
14 STORE_NAME 2 (x)
17 LOAD_NAME 0 (_[1])
20 LOAD_NAME 2 (x)
23 LIST_APPEND
24 JUMP_ABSOLUTE 11
>> 27 DELETE_NAME 0 (_[1])
30 POP_TOP
31 LOAD_CONST 0 (None)
34 RETURN_VALUE
Note how the name actually has to be deleted again!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With