I have been playing with the dis library to disassemble some Python source code, but I see that it does not recurse into functions or classes:
import dis
source_py = "test.py"
with open(source_py) as f_source:
    source_code = f_source.read()
byte_code = compile(source_code, source_py, "exec")
dis.dis(byte_code)
All I see are entries such as:
54 456 LOAD_CONST 63 (<code object foo at 022C9458, file "test.py", line 54>)
459 MAKE_FUNCTION 0
462 STORE_NAME 20 (foo)
If the source file had a function foo(), I could obviously add something like the following to the source file:
dis.dis(foo)
I cannot figure out how to do this without changing the source file and executing it. I would like to be able to extract the pertinent bytes from the compiled byte_code and pass them to dis.dis().
def sub_byte_code(byte_code, function_or_class_name):
    sub_byte_code = xxxxxx
    dis.dis(sub_byte_code)
I have considered wrapping the source code and executing dis.dis() as follows, but I do not wish to execute the script:
source_code_dis = "import dis\n%s\ndis.dis(foo)\n" % (source_code)
exec(source_code_dis)
Is there perhaps a trick to calling it? e.g. dis.dis(byte_code, recurse=True)
The disassembler converts the byte-compiled code into human-readable form. The byte-code interpreter is implemented as a simple stack machine. It pushes values onto a stack of its own, then pops them off to use them in calculations whose results are themselves pushed back on the stack.
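As a small illustration of that stack behaviour, the sketch below compiles the expression a + b (an arbitrary example, not taken from the question) and lists its instructions without evaluating it. The two names are pushed, a binary instruction pops them and pushes the result, and the result is returned. Exact opcode names vary between CPython versions, so the output is not reproduced here.

```python
import dis

# Compile only; the expression is never evaluated, so a and b need not exist.
expression = compile("a + b", "<expr>", "eval")

# Collect the opcode names in execution order.
opnames = [instruction.opname for instruction in dis.get_instructions(expression)]

for instruction in dis.get_instructions(expression):
    # opname is the instruction, argrepr a readable form of its argument.
    print(instruction.opname, instruction.argrepr)
```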
The dis module supports the analysis of CPython bytecode by disassembling it. The CPython bytecode which this module takes as an input is defined in the file Include/opcode.h and used by the compiler and the interpreter.
Late answer but I would have been glad to find it when needed. If you want to fully disassemble a script with functions without importing it, you have to implement the sub_byte_code function mentioned in the question. This is done by scanning byte_code.co_consts to find types.CodeType literals.
The following completes the script from the question:
import dis
import types
source_py = "test.py"
with open(source_py) as f_source:
    source_code = f_source.read()
byte_code = compile(source_code, source_py, "exec")
dis.dis(byte_code)
for x in byte_code.co_consts:
    if isinstance(x, types.CodeType):
        sub_byte_code = x
        func_name = sub_byte_code.co_name
        print('\nDisassembly of %s:' % func_name)
        dis.dis(sub_byte_code)
The result will look something like this:
1 0 LOAD_CONST 0 (<code object foo at 0x02CB99C0, file "test.py", line 1>)
2 LOAD_CONST 1 ('foo')
4 MAKE_FUNCTION 0
6 STORE_NAME 0 (foo)
4 8 LOAD_NAME 0 (foo)
10 LOAD_CONST 2 (42)
12 CALL_FUNCTION 1
14 STORE_NAME 1 (x)
16 LOAD_CONST 3 (None)
18 RETURN_VALUE
Disassembly of foo:
2 0 LOAD_FAST 0 (n)
2 UNARY_NEGATIVE
4 RETURN_VALUE
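The loop above only inspects the constants of the top-level code object, so methods inside classes and functions nested in other functions are not reached. A minimal sketch of a recursive variant follows. It is one possible way to write the sub_byte_code lookup from the question, not the only one. The helper names iter_code_objects and find_code_object, and the sample source, are made up for illustration. Nothing is executed; the source is only compiled.

```python
import dis
import types


def iter_code_objects(code):
    """Yield code itself, then every code object nested inside it, depth-first."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const)


def find_code_object(code, name):
    """Return the first nested code object whose co_name equals name, or None."""
    for candidate in iter_code_objects(code):
        if candidate.co_name == name:
            return candidate
    return None


# Illustrative source text; it is compiled below but never run.
source_code = (
    "def foo(n):\n"
    "    def inner(m):\n"
    "        return m + 1\n"
    "    return inner(-n)\n"
    "\n"
    "class Bar:\n"
    "    def method(self):\n"
    "        return 42\n"
)
byte_code = compile(source_code, "<example>", "exec")

# Look up a nested function by name and disassemble only that one.
inner_code = find_code_object(byte_code, "inner")
if inner_code is not None:
    dis.dis(inner_code)
```

Matching on co_name returns the first hit, so two functions with the same short name in different scopes are not distinguished. On Python 3.11 and later, comparing against co_qualname instead would be one way to tell them apart.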
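Edit: starting with Python 3.7, dis.dis disassembles nested code objects (such as function and class bodies) recursively. It also accepts an additional depth argument that limits how many levels of nested definitions are disassembled; depth=0 means no recursion, and the default of None means no limit.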
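A short sketch of the depth argument on Python 3.7 or later. The sample source and the helper disassemble_to_text are invented for illustration. The output is captured in a string through the file parameter of dis.dis so the three depth settings can be compared.

```python
import dis
import io

# Illustrative source with one function nested inside another; compiled, never run.
source_code = (
    "def outer(n):\n"
    "    def inner(m):\n"
    "        return -m\n"
    "    return inner(n)\n"
)
byte_code = compile(source_code, "<example>", "exec")


def disassemble_to_text(code, depth=None):
    """Return the dis.dis output for code as a string (hypothetical helper)."""
    buffer = io.StringIO()
    dis.dis(code, file=buffer, depth=depth)
    return buffer.getvalue()


top_only = disassemble_to_text(byte_code, depth=0)    # module level only
one_level = disassemble_to_text(byte_code, depth=1)   # module plus outer
everything = disassemble_to_text(byte_code)           # module, outer and inner

print(everything)
```

Each nested code object that is reached is introduced by a "Disassembly of <code object ...>:" header line, so counting those headers shows how deep the recursion went.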
Import the file as a module and call dis.dis() on that module.
import dis
import test
dis.dis(test)
You can also do this from the command-line:
python -m dis test.py
Quoting from the documentation for dis.dis:
For a module, it disassembles all functions.
Edit: As of Python 3.7, dis.dis is recursive.