I know that a .pyc file is generated by the Python interpreter and contains the bytecode, as this question said.
I thought the Python interpreter used file timestamps to detect whether a .pyc is newer than the corresponding .py and, if so, skipped recompiling it on execution (the way make does).
So I ran a test, but it seems I was wrong.
t.py contains print '123' and t1.py contains import t. Running python t1.py printed 123 and generated t.pyc, all as expected.
Then I changed t.py to print '1234' and updated the timestamp of t.pyc with touch t.pyc.
When I ran python t1.py again, I expected 123 but got 1234. So the Python interpreter still knew that t.py had been updated.
Then I wondered whether the interpreter recompiles and regenerates t.pyc on every run of python t1.py. But after running python t1.py several times, I found that t.pyc is not rewritten as long as t.py is unchanged.
So my question is: how does the Python interpreter know when to compile and update a .pyc file?
Updated
Since the Python interpreter uses the timestamp stored in the .pyc file, I assumed it records when the .pyc was last updated and that, on import, it is compared with the timestamp of the .py file.
So I tried to trick it this way: set the OS clock to an earlier time and then edit the .py file.
I thought that on the next import the .py would look older than the .pyc, so the interpreter would not update the .pyc. But I was wrong again.
So does the Python interpreter compare the two timestamps for exact equality rather than for older or newer?
By exact equality, I mean that the timestamp in the .pyc records when the .py was last modified. On import, it is compared with the current modification time of the .py; if the two differ, the interpreter recompiles and updates the .pyc.
.pyc files are created by the Python interpreter when a .py file is imported. They contain the "compiled bytecode" of the imported module/program so that the "translation" from source code to bytecode (which only needs to be done once) can be skipped on subsequent imports if the .pyc is newer than the corresponding .
The py_compile module provides a function to generate a byte-code file from a source file, and another function used when the module source file is invoked as a script.
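As a rough illustration of the first of those functions, the sketch below calls py_compile.compile() on a throwaway source file. The helper name compile_example and the file name t.py are only placeholders for this example, and the exact location of the generated file depends on the Python version and configuration.

```python
import os
import py_compile
import tempfile


def compile_example():
    """Compile a temporary t.py and report what py_compile produced.

    compile_example is a hypothetical helper written for this sketch.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "t.py")
        with open(src, "w") as f:
            f.write("print('123')\n")
        # compile() returns the path of the bytecode file it wrote;
        # doraise=True turns compile errors into exceptions.
        pyc_path = py_compile.compile(src, doraise=True)
        return pyc_path.endswith(".pyc"), os.path.exists(pyc_path)


if __name__ == "__main__":
    print(compile_example())
```

The second function mentioned above is what runs when the module is invoked as a script, for example python -m py_compile t.py.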
When a Python source file (module) is imported during an execution for the first time, the appropriate .pyc file is created automatically. If the same module is imported again, then the already created .pyc file is used.
.py files contain the source code of a program, whereas a .pyc file contains the bytecode of your program.
It looks like the timestamp is stored directly in the *.pyc file. The Python interpreter doesn't rely on the last-modification attribute of the .pyc file itself, maybe to avoid incompatible bytecode issues when copying source trees.
Looking at the Python implementation of the import statement, you can find the stale check in _validate_bytecode_header(). By the looks of it, it extracts bytes 4 to 7 (inclusive) and compares them against the modification time of the source file. If they don't match, the bytecode is considered stale and is recompiled.
In the process, it also checks the length of the source file against the length of the source used to generate the bytecode (stored in bytes 8 to 11).
In the Python implementation, if one of those checks fails, the bytecode loader raises an ImportError, which is caught by SourceLoader.get_code() and triggers a recompilation of the bytecode.
Note: that's how it's done in the Python version of importlib. I guess there's no functional difference in the native version, but my C is a bit too rusty to dig into the compiler code.
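To make the equality check concrete, here is a minimal sketch that reads the header of a .pyc and compares it with the source file's current modification time and size. It is not the importlib code itself. It assumes Python 3.7 or later, where (per PEP 552) a 4-byte flags field follows the magic number, so the timestamp sits in bytes 8 to 11 and the source size in bytes 12 to 15 rather than at the offsets quoted above for the older header. The names read_pyc_header, is_stale and demo are hypothetical helpers for this illustration.

```python
import os
import py_compile
import struct
import tempfile


def read_pyc_header(pyc_path):
    """Return (magic, mtime, size) from a timestamp-based .pyc (3.7+ layout)."""
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    magic = header[:4]
    # Assumed layout: magic | flags | source mtime | source size, little-endian.
    flags, mtime, size = struct.unpack("<III", header[4:16])
    if flags != 0:
        raise ValueError("hash-based .pyc: no timestamp is stored")
    return magic, mtime, size


def is_stale(source_path, pyc_path):
    """True if the recorded mtime or size differs from the source's current values."""
    st = os.stat(source_path)
    _, mtime, size = read_pyc_header(pyc_path)
    # Equality test, not "older/newer": any mismatch counts as stale.
    return (int(st.st_mtime) & 0xFFFFFFFF) != mtime or (st.st_size & 0xFFFFFFFF) != size


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "t.py")
        with open(src, "w") as f:
            f.write("print('123')\n")
        pyc = py_compile.compile(
            src,
            cfile=os.path.join(tmp, "t.pyc"),
            doraise=True,
            invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
        )
        fresh = is_stale(src, pyc)
        # Equivalent of "touch t.pyc": the .pyc's own mtime is never consulted.
        os.utime(pyc, None)
        after_touch = is_stale(src, pyc)
        # Make the source look one hour OLDER: still a mismatch, so still stale.
        st = os.stat(src)
        os.utime(src, (st.st_atime, st.st_mtime - 3600))
        after_backdate = is_stale(src, pyc)
        return fresh, after_touch, after_backdate


if __name__ == "__main__":
    print(demo())  # (False, False, True)
```

The three results mirror the experiments in the question: a freshly compiled .pyc is valid, touching the .pyc changes nothing, and a source file that merely looks older is still treated as changed because the comparison is for equality.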