How does the python interpreter know when to compile and update a .pyc file?

Tags:

python

I know that a .pyc file is generated by the Python interpreter and contains the bytecode, as this question says.

I thought the Python interpreter used the file timestamps to detect whether a .pyc is newer than its .py and, if so, skipped compiling it again on execution (the way make does).

So I ran a test, but it seems I was wrong.

1. I wrote t.py containing print '123' and t1.py containing import t. Running python t1.py printed 123 and generated t.pyc, all as expected.
2. Then I edited t.py to print '1234' and updated the timestamp of t.pyc with touch t.pyc.
3. I ran python t1.py again. I expected 123 but got 1234, so the Python interpreter apparently still knew that t.py had changed.
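The three steps above can be reproduced with a script along these lines. It is only a sketch for Python 3.7+, where print is a function and the cached bytecode is located via importlib.util.cache_from_source() rather than sitting next to t.py; the helper name run_experiment is made up for illustration.

```python
import importlib.util
import os
import subprocess
import sys
import tempfile


def run_experiment():
    """Reproduce the three steps and return the output of both runs."""
    with tempfile.TemporaryDirectory() as tmp:
        t_py = os.path.join(tmp, "t.py")
        t1_py = os.path.join(tmp, "t1.py")

        def write(path, text):
            with open(path, "w") as f:
                f.write(text)

        def run():
            # Equivalent of typing "python t1.py" inside the directory.
            result = subprocess.run(
                [sys.executable, "t1.py"],
                cwd=tmp, capture_output=True, text=True, check=True,
            )
            return result.stdout.strip()

        # Step 1: the first import compiles t.py and caches the bytecode.
        write(t_py, "print('123')\n")
        write(t1_py, "import t\n")
        first = run()

        # Step 2: edit the source, then "touch" the cached bytecode file.
        write(t_py, "print('1234')\n")
        pyc = importlib.util.cache_from_source(t_py)
        if os.path.exists(pyc):  # absent if bytecode writing is disabled
            os.utime(pyc, None)

        # Step 3: import again and see which version runs.
        second = run()
        return first, second


if __name__ == "__main__":
    print(run_experiment())
```

The second run reports the edited output even though the cache file carries the newer filesystem timestamp, which matches what the steps describe.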

Then I wondered whether the Python interpreter compiles and regenerates t.pyc every time python t1.py runs. But when I ran python t1.py several times, I found that t.pyc is not rewritten as long as t.py has not changed.

So my question is: how does the Python interpreter know when to compile and update a .pyc file?

Update

The Python interpreter uses a timestamp stored inside the .pyc file. I assumed it records when the .pyc was last updated and that, on import, it is compared with the timestamp of the .py file.

So I tried to hack it this way: I set the OS clock to an earlier time and edited the .py file. I thought that on the next import the .py would look older than the .pyc, so the Python interpreter would not update the .pyc. But I was wrong again.

So does the Python interpreter compare the two timestamps for exact equality rather than for older versus newer?

By exact equality, I mean that the timestamp in the .pyc records when the .py was last modified. On import, it is compared with the current timestamp of the .py; if the two differ, the .pyc is recompiled and updated.

asked May 21 '14 06:05

WKPlus

1 Answer

It looks like the timestamp is stored directly in the *.pyc file. The Python interpreter doesn't rely on the last-modification attribute of the .pyc file itself, maybe to avoid incompatible bytecode issues when copying source trees.

Looking at the Python implementation of the import statement, you can find the stale check in _validate_bytecode_header(). By the looks of it, it extracts bytes 4 to 7 (inclusive) and compares them against the modification time of the source file. If they don't match, the bytecode is considered stale and is recompiled.

In the process, it also checks the length of the source file against the length of the source that was used to generate the bytecode (stored in bytes 8 to 11).
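To see both recorded values on a real file, something like the following could be used. It is a sketch under one assumption that differs from the byte offsets above: it targets CPython 3.7+, where PEP 552 added a 4-byte flags field after the magic number, so the mtime and size sit in bytes 8 to 11 and 12 to 15. The helper names read_header, matches_source and demo are made up for illustration.

```python
import os
import py_compile
import struct
import tempfile


def read_header(pyc_path):
    """Return the (mtime, size) pair recorded in a timestamp-based .pyc.

    Assumes the CPython 3.7+ header: magic (bytes 0-3), flags (4-7),
    source mtime (8-11), source size (12-15), little-endian. On 3.3 to
    3.6 there is no flags field, so the values sit in bytes 4-7 and 8-11.
    """
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    mtime, size = struct.unpack("<II", header[8:16])
    return mtime, size


def matches_source(src_path, pyc_path):
    # Equality test on both fields, not "is the .pyc newer than the .py".
    st = os.stat(src_path)
    expected = (int(st.st_mtime) & 0xFFFFFFFF, st.st_size & 0xFFFFFFFF)
    return read_header(pyc_path) == expected


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "t.py")
        with open(src, "w") as f:
            f.write("print('123')\n")
        pyc = py_compile.compile(
            src,
            cfile=os.path.join(tmp, "t.pyc"),
            invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
        )
        fresh = matches_source(src, pyc)
        # Backdate the source by an hour: it is now OLDER than the .pyc file.
        past = os.stat(src).st_mtime - 3600
        os.utime(src, (past, past))
        backdated = matches_source(src, pyc)
        return fresh, backdated


if __name__ == "__main__":
    print(demo())
```

Backdating the source makes the recorded values stop matching, which is why setting the clock back does not trick the check: what matters is whether the numbers are equal, not which file looks newer.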

In the Python implementation, if either of those checks fails, the bytecode loader raises an ImportError, which is caught by SourceLoader.get_code() and triggers a recompilation of the bytecode.
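The control flow described here can be sketched roughly as follows. This is not the importlib source: check_header and this simplified get_code are stand-ins, and the cached argument is a hypothetical (mtime, size, code) triple playing the role of an already parsed .pyc file.

```python
def check_header(recorded_mtime, recorded_size, source_mtime, source_size):
    """Raise ImportError unless both recorded values equal the source's."""
    if recorded_mtime != source_mtime:
        raise ImportError("bytecode is stale: mtime differs")
    if recorded_size != source_size:
        raise ImportError("bytecode is stale: source size differs")


def get_code(source_text, source_mtime, cached=None):
    """Return (code, recompiled) for the given source.

    `cached` is a hypothetical (mtime, size, code) triple standing in
    for the contents of a .pyc file; None means no cache exists yet.
    """
    source_size = len(source_text.encode("utf-8"))
    if cached is not None:
        recorded_mtime, recorded_size, code = cached
        try:
            check_header(recorded_mtime, recorded_size,
                         source_mtime, source_size)
        except ImportError:
            pass  # stale cache: fall through and recompile
        else:
            return code, False
    return compile(source_text, "<sketch>", "exec"), True


if __name__ == "__main__":
    src = "x = 1\n"
    code, _ = get_code(src, 1000)
    cached = (1000, len(src), code)
    print(get_code(src, 1000, cached)[1])  # cache reused
    print(get_code(src, 999, cached)[1])   # older mtime still recompiles
```

Using an exception for the stale case keeps the loader's happy path short: any mismatch simply falls back to compiling from source.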

Note: That's how it's done in the Python version of importlib. I guess there's no functional difference in the native version, but my C is a bit too rusty to dig into the compiler code.

answered Sep 21 '22 05:09

svvac
