
Possible to execute Python bytecode from a script?

Say I have a running CPython session. Is there a way to run the data (bytes) from a pyc file directly, without necessarily having the data on disk and without writing a temporary pyc file?

Example script to show a simple use-case:

if foo:
    # Intentionally ambiguous, since the data source
    # is a detail and answers shouldn't depend on this detail.
    data = read_data_from_somewhere()
else:
    with open("bar.pyc", 'rb') as f:
        data = f.read()

assert isinstance(data, bytes)

code = bytes_to_code(data)

# call a method from the loaded code
code.call_function()

The exact use isn't important, but generating code dynamically and copying it over a network to execute is one use-case (for the purpose of thinking about this question).


Here are some example use-cases, which made me curious to know how this can be done:

  • Checking Python scripts for malicious code.
    If a single command can access a larger body of code hidden in binary data, what would that command look like?
  • Dynamically generating code and caching it for re-use (not necessarily on disk; a database could be used, for example).
  • Sending pre-compiled bytecode to a process, for example to control an application which embeds Python.
asked Apr 27 '15 13:04 by ideasman42



2 Answers

Is there a way to run the data from a pyc file directly?

A compiled code object can be serialized with marshal:

import marshal
data = marshal.dumps(eggs)  # eggs is a code object, e.g. from compile()

The bytes can be converted back to a code object and executed:

eggs = marshal.loads(data)
exec(eggs)
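
As a minimal sketch of the round trip described above, the following compiles a small piece of source, serializes the code object to bytes, restores it, and executes it. The source string, the "<eggs>" filename label, and the variable names are made up for illustration. Note that marshal data is only valid for the Python version that produced it.

```python
import marshal

# Hypothetical source; any module-level code would do.
source = "answer = 6 * 7"

# compile() in "exec" mode returns a code object for the whole snippet.
eggs = compile(source, "<eggs>", "exec")

# Serialize the code object to bytes (could be stored or sent anywhere).
data = marshal.dumps(eggs)

# Restore the code object and run it in a fresh namespace.
restored = marshal.loads(data)
namespace = {}
exec(restored, namespace)
```

After this runs, the names defined by the restored code are available in namespace.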

A pyc file is a marshaled code object preceded by a header.

In Python 3 the header must be skipped before the remaining data is passed to marshal.loads. It is 12 bytes in Python 3.3 to 3.6 and 16 bytes from Python 3.7 onward (PEP 552 added a flags field).


See Ned Batchelder's blog post:

At the simple level, a .pyc file is a binary file containing only three things:

  • A four-byte magic number,
  • A four-byte modification timestamp, and
  • A marshalled code object.

Note: the link describes Python 2, but the format is almost the same in Python 3; only the header is larger than 8 bytes (12 bytes in Python 3.3 to 3.6, 16 bytes since 3.7).
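
Putting the pieces together, here is a rough sketch of something like the bytes_to_code function from the question. The function name module_from_pyc_bytes, the module name, and the in-memory test data are all invented for this example. The header size is chosen from the running interpreter's version, which assumes the data was produced by the same Python version.

```python
import importlib.util
import marshal
import sys
import types

# Header size depends on the Python version that wrote the pyc data:
# 16 bytes since 3.7, 12 bytes for 3.3 to 3.6.
PYC_HEADER_SIZE = 16 if sys.version_info >= (3, 7) else 12


def module_from_pyc_bytes(data, name="loaded_from_bytes"):
    """Build a module object from the raw bytes of a pyc file."""
    # The first four bytes are the magic number of the compiling interpreter.
    if data[:4] != importlib.util.MAGIC_NUMBER:
        raise ValueError("pyc data was compiled by a different Python version")
    # Everything after the header is a marshaled code object.
    code = marshal.loads(data[PYC_HEADER_SIZE:])
    module = types.ModuleType(name)
    exec(code, module.__dict__)
    return module


# Build pyc-like bytes in memory so the example does not touch the disk.
source = "def call_function():\n    return 'hello from bytecode'\n"
code = compile(source, "<in-memory>", "exec")
header = importlib.util.MAGIC_NUMBER + b"\x00" * (PYC_HEADER_SIZE - 4)
data = header + marshal.dumps(code)

bar = module_from_pyc_bytes(data)
result = bar.call_function()
```

The same function should work on bytes read from a real pyc file written by the running interpreter. Since marshal.loads and exec will run whatever they are given, this should only be used on data from a trusted source.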

answered Sep 28 '22 04:09 by Thomas


Assuming the .pyc was compiled by a compatible Python version, you can just import it. So with a file bar.pyc on the Python path (in Python 3 it must sit where bar.py would be, not inside __pycache__), the following works even if bar.py does not exist:

import bar
bar.call_function()
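
A small sketch of this approach, assuming a temporary directory is acceptable for the demonstration. The module name bar_demo and its contents are invented here. The source is compiled with py_compile to a pyc placed directly in the directory, the source is deleted, and the module is then imported from the pyc alone.

```python
import importlib
import os
import py_compile
import sys
import tempfile

# Hypothetical module written to a scratch directory.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "bar_demo.py")
pyc_path = os.path.join(workdir, "bar_demo.pyc")

with open(src_path, "w") as f:
    f.write("def call_function():\n    return 'called'\n")

# Write the pyc next to the source (not into __pycache__), then drop the source.
py_compile.compile(src_path, cfile=pyc_path, doraise=True)
os.remove(src_path)

# Make the directory importable and import from the pyc only.
sys.path.insert(0, workdir)
importlib.invalidate_caches()
bar_demo = importlib.import_module("bar_demo")
result = bar_demo.call_function()
```

Unlike the marshal approach, this still requires the pyc to exist on disk.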
answered Sep 28 '22 03:09 by poke