
How to fully disassemble Python source

I have been playing with the dis library to disassemble some Python source code, but I see that this does not recurse into functions or classes:

import dis

source_py = "test.py"

with open(source_py) as f_source:
    source_code = f_source.read()

byte_code = compile(source_code, source_py, "exec")
dis.dis(byte_code)

All I see are entries such as:

 54         456 LOAD_CONST              63 (<code object foo at 022C9458, file "test.py", line 54>)
            459 MAKE_FUNCTION            0
            462 STORE_NAME              20 (foo)

If the source file had a function foo(), I could obviously add something like the following to the source file:

dis.dis(foo)

I cannot figure out how to do this without changing the source file and executing it. I would like to be able to extract the pertinent bytes from the compiled byte_code and pass them to dis.dis().

def sub_byte_code(byte_code, function_or_class_name):
    sub_byte_code = xxxxxx
    dis.dis(sub_byte_code)

I have considered wrapping the source code and executing dis.dis() as follows, but I do not wish to execute the script:

source_code_dis = "import dis\n%s\ndis.dis(foo)\n" % (source_code)
exec(source_code_dis)

Is there perhaps a trick to calling it? e.g. dis.dis(byte_code, recurse=True)

Martin Evans asked Aug 13 '15 13:08


2 Answers

A late answer, but I would have been glad to find it when I needed it. To fully disassemble a script, including its functions, without importing it, you have to implement the sub_byte_code function mentioned in the question. To do so, scan byte_code.co_consts for code objects, i.e. instances of types.CodeType.

The following completes the script from the question:

import dis
import types

source_py = "test.py"

with open(source_py) as f_source:
    source_code = f_source.read()

byte_code = compile(source_code, source_py, "exec")
dis.dis(byte_code)

for x in byte_code.co_consts:
    if isinstance(x, types.CodeType):
        sub_byte_code = x
        func_name = sub_byte_code.co_name
        print('\nDisassembly of %s:' % func_name)
        dis.dis(sub_byte_code)

The result will look something like this (exact opcodes and offsets vary with the Python version):

  1           0 LOAD_CONST               0 (<code object foo at 0x02CB99C0, file "test.py", line 1>)
              2 LOAD_CONST               1 ('foo')
              4 MAKE_FUNCTION            0
              6 STORE_NAME               0 (foo)

  4           8 LOAD_NAME                0 (foo)
             10 LOAD_CONST               2 (42)
             12 CALL_FUNCTION            1
             14 STORE_NAME               1 (x)
             16 LOAD_CONST               3 (None)
             18 RETURN_VALUE

Disassembly of foo:
  2           0 LOAD_FAST                0 (n)
              2 UNARY_NEGATIVE
              4 RETURN_VALUE
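
The loop above only looks at the top-level constants, so methods inside a class body or functions nested inside other functions are not reached. A minimal sketch of a recursive variant follows. The helper names iter_code_objects and find_code are hypothetical, and the inline source string is only a stand-in for test.py. It uses dis.disassemble, which prints a single code object without recursing, so nothing is printed twice on newer Python versions.

```python
import dis
import types


def iter_code_objects(code):
    """Yield `code` and every code object nested inside it, depth first."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const)


def find_code(code, name):
    """Return the first nested code object whose co_name equals `name`, or None."""
    for sub in iter_code_objects(code):
        if sub.co_name == name:
            return sub
    return None


# Inline stand-in for test.py so the sketch is self-contained (an assumption).
source_code = """
class Greeter:
    def greet(self, name):
        return 'hello ' + name

def outer(n):
    def inner(m):
        return -m
    return inner(n)
"""

# The source is compiled but never executed.
byte_code = compile(source_code, "<example>", "exec")

for sub in iter_code_objects(byte_code):
    print('\nDisassembly of %s:' % sub.co_name)
    dis.disassemble(sub)  # prints this code object only, no recursion
```

With these helpers, something like dis.disassemble(find_code(byte_code, 'inner')) would cover the "look up by name" case from the question. Note that co_name is not unique: two methods with the same name in different classes would both match, and find_code returns only the first.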

Edit: starting from Python 3.7, dis.dis recursively disassembles nested code objects such as function bodies. It also accepts an optional depth argument that limits how many levels of nesting are disassembled; depth=0 means no recursion.
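
As a rough illustration of the depth argument, the sketch below (Python 3.7 or later assumed) disassembles the same compiled source twice and captures the output through the file argument. The inline source string is again only a stand-in for test.py.

```python
import dis
import io

# Inline stand-in for test.py (an assumption for this sketch).
source_code = """
def foo(n):
    return -n

x = foo(42)
"""

# The source is compiled but never executed.
byte_code = compile(source_code, "<example>", "exec")

# Default (depth=None): the module code plus every nested code object.
full = io.StringIO()
dis.dis(byte_code, file=full)

# depth=0: top-level code only, matching the pre-3.7 behaviour.
shallow = io.StringIO()
dis.dis(byte_code, file=shallow, depth=0)

print(full.getvalue())
```

The recursive output contains a "Disassembly of <code object foo ...>:" section for the function body, while the depth=0 output stops at the module-level instructions.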

Gilles Arcas answered Oct 05 '22 10:10


Import the file as a module and call dis.dis() on that module. Note that importing executes the module's top-level code.

import dis
import test

dis.dis(test)

You can also do this from the command line:

python -m dis test.py

Quoting from the documentation for dis.dis:

For a module, it disassembles all functions.

Edit: As of Python 3.7, dis.dis is recursive.

Roland Smith answered Oct 05 '22 11:10