I am having a lot of difficulty understanding Python's bytecode and its dis
module.
import dis

def func():
    x = 1

dis.dis(func)
The above code, when typed into the interpreter, produces the following output:
0 LOAD_CONST               1 (1)
3 STORE_FAST               0 (x)
6 LOAD_CONST               0 (None)
9 RETURN_VALUE
For example: what is the meaning of LOAD_CONST, STORE_FAST and the numbers like 0, 3, 6 and 9?
A specific resource where I can find this information would be much appreciated.
Bytecode is a low-level, platform-independent representation of your source code. It is not binary machine code, however, and cannot be run by the target machine directly. Instead, it is a set of instructions for a virtual machine, called the Python Virtual Machine (PVM).
Decompyle is a Python disassembler and decompiler that converts Python bytecode (.pyc or .pyo) back into equivalent Python source. Verification of the produced code (by recompiling it) is available as well.
The numbers before the bytecodes are offsets into the original binary bytecodes:
>>> func.__code__.co_code
'd\x01\x00}\x00\x00d\x00\x00S'
Some bytecodes come with additional information (arguments) that influences how the bytecode works. The offset tells you at what position in the bytestream the bytecode was found.
For example, the LOAD_CONST bytecode (ASCII d, hex 64) is followed by two additional bytes that encode a reference to a constant associated with the bytecode. As a result, the STORE_FAST opcode (ASCII }, hex 7D) is found at index 3.
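To make the offsets concrete, here is a minimal sketch that decodes the byte string shown above by hand. It assumes the older layout that this output comes from, where an opcode with an argument takes three bytes (newer CPython 3 releases use a different layout, so real offsets there will differ). The decode helper and the small opcode table are hypothetical names for illustration; 0x53 for RETURN_VALUE and the threshold of 90 are assumed from the CPython 2.7 opcode table rather than taken from the text.

```python
# Byte string copied from the co_code output quoted above.
CO_CODE = b'd\x01\x00}\x00\x00d\x00\x00S'

# 0x64 and 0x7D are stated in the text; 0x53 (ASCII 'S') for RETURN_VALUE
# and the HAVE_ARGUMENT threshold are assumptions based on CPython 2.7.
OPNAMES = {0x64: "LOAD_CONST", 0x7D: "STORE_FAST", 0x53: "RETURN_VALUE"}
HAVE_ARGUMENT = 90  # opcodes >= this value carry a 2-byte argument


def decode(code):
    """Return (offset, opname, argument) triples for the bytestream."""
    instructions = []
    offset = 0
    while offset < len(code):
        opcode = code[offset]
        if opcode >= HAVE_ARGUMENT:
            # Two argument bytes, little-endian: low byte first.
            arg = code[offset + 1] | (code[offset + 2] << 8)
            size = 3
        else:
            arg = None
            size = 1
        instructions.append((offset, OPNAMES.get(opcode, hex(opcode)), arg))
        offset += size
    return instructions


for offset, name, arg in decode(CO_CODE):
    print(offset, name, "" if arg is None else arg)
```

The printed offsets are 0, 3, 6 and 9, matching the first column of the dis output: each of the first three instructions occupies one opcode byte plus two argument bytes.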
The dis module documentation lists what each instruction means. For LOAD_CONST, it says:
Pushes co_consts[consti] onto the stack.
which refers to the co_consts structure that is always present with a code object; the compiler constructs that:
>>> func.__code__.co_consts
(None, 1)
The opcode loads index 1 from that structure (the 01 00 bytes in the bytecode encode a 1), and dis has looked that up for you; it is the value 1.
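The same lookup can be checked on a current interpreter. This is a minimal sketch assuming Python 3.4 or later, where dis.get_instructions is available. The exact opcodes and the contents of co_consts vary between CPython versions (some releases do not store small integers in co_consts at all), so the printed values are illustrative only.

```python
import dis


def func():
    x = 1


code = func.__code__
print(code.co_consts)  # the constants tuple; contents vary by CPython version

# For every instruction whose argument is an index into co_consts,
# pair the raw index with the constant that dis resolved it to.
const_uses = [
    (instr.opname, instr.arg, instr.argval)
    for instr in dis.get_instructions(func)
    if instr.opcode in dis.hasconst
]
for opname, index, value in const_uses:
    print(opname, index, repr(value), code.co_consts[index] == value)
```

In each printed row the last column should be True: the value dis shows in parentheses is simply co_consts[index].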
The next instruction, STORE_FAST, is described as:
Stores TOS into the local co_varnames[var_num].
Here TOS refers to Top Of Stack; note that the LOAD_CONST just pushed something onto the stack, the 1 value. co_varnames is another structure; it references local variable names, and the opcode references index 0:
>>> func.__code__.co_varnames
('x',)
dis looked that up too, and the name you used in your code is x. Thus, this opcode stored 1 into x.
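The STORE_FAST lookup can be inspected the same way. Again a sketch assuming Python 3.4 or later for dis.get_instructions; the stores list is just an illustrative name.

```python
import dis


def func():
    x = 1


code = func.__code__
print(code.co_varnames)  # ('x',)

# Collect (index, resolved name) for each STORE_FAST instruction.
stores = [
    (instr.arg, instr.argval)
    for instr in dis.get_instructions(func)
    if instr.opname == "STORE_FAST"
]
for index, name in stores:
    # The raw argument is an index; dis resolves it to the variable name.
    print(index, name, code.co_varnames[index])
```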
Another LOAD_CONST loads None onto the stack from index 0, followed by RETURN_VALUE:
Returns with TOS to the caller of the function.
so this instruction takes the top of the stack (with the None constant) and returns from this code block. None is the default return value for functions without an explicit return statement.
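Putting the four instructions together, the walk-through above can be mimicked with a toy stack machine. This is only a sketch of the idea, not how CPython is implemented: the run function is hypothetical, and the constants, variable names and instruction list are copied from the values quoted earlier.

```python
# Values quoted in the walk-through above.
CO_CONSTS = (None, 1)
CO_VARNAMES = ("x",)
PROGRAM = [
    ("LOAD_CONST", 1),
    ("STORE_FAST", 0),
    ("LOAD_CONST", 0),
    ("RETURN_VALUE", None),
]


def run(program, consts, varnames):
    """Toy evaluator for the three opcodes discussed here."""
    stack = []
    local_vars = {}
    for opname, arg in program:
        if opname == "LOAD_CONST":
            # Push co_consts[arg] onto the stack.
            stack.append(consts[arg])
        elif opname == "STORE_FAST":
            # Pop the top of the stack into the named local variable.
            local_vars[varnames[arg]] = stack.pop()
        elif opname == "RETURN_VALUE":
            # Return the top of the stack to the caller.
            return stack.pop(), local_vars
        else:
            raise ValueError("unsupported opcode: " + opname)
    raise RuntimeError("program ended without RETURN_VALUE")


result, final_locals = run(PROGRAM, CO_CONSTS, CO_VARNAMES)
print(result, final_locals)
```

The run ends with x bound to 1 and None as the return value, which is the behaviour described for func.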
You omitted something from the dis output, the line numbers:
>>> dis.dis(func)
  2           0 LOAD_CONST               1 (1)
              3 STORE_FAST               0 (x)
              6 LOAD_CONST               0 (None)
              9 RETURN_VALUE
Note the 2 on the first line; that's the line number in the original source that contains the Python code that was used for these instructions. Python code objects have co_lnotab and co_firstlineno attributes that let you map bytecodes back to line numbers in the original source. dis does this for you when displaying a disassembly.
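If you want the offset-to-line mapping without decoding co_lnotab yourself, the dis module exposes it through dis.findlinestarts. A minimal sketch, assuming Python 3; which offsets appear, and whether the def line itself is listed, depends on the interpreter version.

```python
import dis


def func():
    x = 1


code = func.__code__
first = code.co_firstlineno  # line of the "def" statement

# findlinestarts yields (bytecode offset, source line) pairs.
for offset, line in dis.findlinestarts(code):
    # Some entries may carry no line number on newer versions; skip those.
    if line is not None:
        print(offset, line, "(def line + %d)" % (line - first))
```

The assignment x = 1 sits one line below the def statement, so one of the printed entries should show "def line + 1".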