Say I have a running CPython session. Is there a way to run the data (bytes) from a pyc file directly, without necessarily having the data on disk and without having to write a temporary pyc file?
Example script to show a simple use-case:
if foo:
    # Intentionally ambiguous, since the data source
    # is a detail and answers shouldn't depend on this detail.
    data = read_data_from_somewhere()
else:
    data = open("bar.pyc", 'rb').read()
assert type(data) is bytes
code = bytes_to_code(data)
# call a method from the loaded code
code.call_function()
Exact use isn't important, but generating code dynamically and copying over a network to execute is one use-case (for the purpose of thinking about this question).
Here are some example use-cases, which made me curious to know how this can be done:
When you run a Python source file (one with the .py extension), Python first compiles it into bytecode. The bytecode is a low-level, platform-independent representation of your source code. It is not binary machine code, however, and cannot be run by the target machine directly.
It is the job of the compiler to translate Python code to bytecode. The compiler stores the bytecode in a code object, a structure that fully describes what a code block, such as a module or a function, does. To execute a code object, CPython first creates a state of execution for it called a frame object.
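As a rough illustration of the paragraph above, the built-in compile() produces a code object from source text, and exec() runs it. The source string and the names greet and namespace are made up for this sketch.

```python
# Hypothetical source text; any valid Python module source would do.
source = "def greet(name):\n    return 'hello ' + name\n"

# compile() returns a code object describing the whole module body.
code = compile(source, "<example>", "exec")
kind = type(code).__name__

# Executing the code object defines greet() inside the given namespace.
namespace = {}
exec(code, namespace)
result = namespace["greet"]("world")
print(kind, result)
```

The code object itself holds no state; the frame that CPython creates when exec() runs it is where the execution state lives.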
Bytecode is created automatically when a Python module is imported for the first time, or when the source is more recent than the current compiled file. Python 3 stores it in a __pycache__ directory next to the .py file (Python 2 wrote the .pyc file into the same directory as the source). The next time the program runs, the interpreter uses this file to skip the compilation step.
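To see where the interpreter would look for the cached bytecode of a given source file, importlib.util.cache_from_source can be asked directly. This is only a sketch; bar.py is a placeholder name and does not need to exist.

```python
import importlib.util

# Returns the path the import system would use for the cached bytecode
# of "bar.py". The exact file name depends on the interpreter version
# and optimization settings, so no particular value is assumed here.
cache_path = importlib.util.cache_from_source("bar.py")
print(cache_path)
```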
Python doesn't convert its code into machine code, which is what the hardware understands. Instead it converts it into bytecode. So compilation does happen within Python, just not into machine language.
Is there a way to run the data from a pyc file directly?
The compiled code object can be saved using marshal:
import marshal
# eggs is a code object; "data" avoids shadowing the built-in bytes type
data = marshal.dumps(eggs)
The bytes can be converted back to a code object and executed:
eggs = marshal.loads(data)
exec(eggs)
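A self-contained version of this round trip might look like the following. The source string, the file name "<eggs>" and the function call_function are invented for the example; eggs is produced with compile() here, which is an assumption about where the code object comes from.

```python
import marshal

# Hypothetical module source standing in for dynamically generated code.
source = "def call_function():\n    return 42\n"
eggs = compile(source, "<eggs>", "exec")

# Serialize the code object to bytes, e.g. to send over a network.
data = marshal.dumps(eggs)

# On the receiving side, rebuild the code object and run it.
restored = marshal.loads(data)
namespace = {}
exec(restored, namespace)
value = namespace["call_function"]()
print(value)
```

Note that the marshal format is specific to the Python version, so both sides should run the same interpreter version, and marshal.loads should not be fed untrusted data.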
A pyc file is a marshaled code object with a header. For Python 3, the header needs to be skipped (12 bytes on Python 3.3 to 3.6, 16 bytes on Python 3.7 and later, see PEP 552); the remaining data can be read via marshal.loads.
See Ned Batchelder's blog post:
At the simple level, a .pyc file is a binary file containing only three things:
- A four-byte magic number,
- A four-byte modification timestamp, and
- A marshalled code object.
Note, the link references Python 2, but it's almost the same in Python 3; the pyc header is just larger than 8 bytes (12 bytes on Python 3.3 to 3.6, 16 bytes on Python 3.7 and later).
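Putting the pieces together, one possible sketch of the bytes_to_code helper from the question is shown below. The helper names (pyc_header_size, bytes_to_code, bytes_to_module) and the module name are hypothetical, and the header size is chosen from the interpreter version on the assumption that the pyc data was produced by the same Python version that loads it.

```python
import importlib.util
import marshal
import sys
import types


def pyc_header_size():
    # 16 bytes on Python 3.7+ (PEP 552), 12 bytes on Python 3.3 to 3.6.
    return 16 if sys.version_info >= (3, 7) else 12


def bytes_to_code(data):
    # The first four bytes are the magic number of the compiling interpreter.
    if data[:4] != importlib.util.MAGIC_NUMBER:
        raise ValueError("pyc data was compiled by a different Python version")
    # Everything after the header is the marshaled code object.
    return marshal.loads(data[pyc_header_size():])


def bytes_to_module(data, name="loaded_from_bytes"):
    # Run the module-level code inside a fresh module object.
    module = types.ModuleType(name)
    exec(bytes_to_code(data), module.__dict__)
    return module


# Build pyc-style bytes in memory for the demo: magic number, a zero-filled
# rest of the header, then the marshaled code object. No file is written.
source = "def call_function():\n    return 'called'\n"
header = importlib.util.MAGIC_NUMBER + bytes(pyc_header_size() - 4)
data = header + marshal.dumps(compile(source, "<bar>", "exec"))

module = bytes_to_module(data)
result = module.call_function()
print(result)
```

Returning a module object rather than a bare code object is a design choice made here so that code.call_function() style access from the question works; the same approach applies to bytes read from a real bar.pyc.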
Assuming the compiled .pyc is correct for the platform (in practice, that it was compiled by the same Python version as the running interpreter), you can just import it. So with a file bar.pyc in the Python path, the following works even if bar.py does not exist:
import bar
bar.call_function()
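This can be tried end to end with a throwaway directory. The sketch below uses a made-up module name, bar_demo, compiles it with py_compile, deletes the source, and then imports the remaining .pyc; unlike the in-memory approach it does write files to disk.

```python
import importlib
import os
import py_compile
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Write a small hypothetical module and compile it next to the source.
    src = os.path.join(tmp, "bar_demo.py")
    with open(src, "w") as f:
        f.write("def call_function():\n    return 'from pyc'\n")
    py_compile.compile(src, cfile=os.path.join(tmp, "bar_demo.pyc"), doraise=True)

    # Remove the source so only bar_demo.pyc is left on the path.
    os.remove(src)

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        bar_demo = importlib.import_module("bar_demo")
        result = bar_demo.call_function()
    finally:
        sys.path.remove(tmp)

print(result)
```

For a sourceless import the .pyc has to sit directly in the directory on the path, not inside __pycache__.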