I have skimmed the PyPy implementation details and gone through the source code as well, but PyPy's execution path is still not entirely clear to me.
Sometimes bytecode is produced, and sometimes it is skipped in favor of immediate machine-code compilation (interpreter-level vs. app-level code), but I can't figure out exactly when and where the machine code is produced and handed to the OS for execution as low-level instructions (RAM/CPU).
I managed to get that straight in the case of CPython, as there is a giant switch in ceval.c (already compiled) that interprets bytecode and runs the corresponding code (in actual C). Makes sense.
But as far as PyPy is concerned, I did not manage to get a clear view on how this is done, specifically (I do not want to get into the various optimization details of PyPy, that's not what I am after here).
I would be satisfied with an answer that points to the PyPy source code, so as to avoid "hearsay" and be able to see it "with my own eyes" (I spotted the JIT backends under /rpython, with assemblers for the various CPU architectures).
Your best guide is the PyPy architecture documentation, and the actual JIT documentation.
What jumped out the most for me is this:
we have a tracing JIT that traces the interpreter written in RPython, rather than the user program that it interprets.
This is covered in more detail in the JIT overview.
It seems that the "core" is this (from here):
Once the meta-interpreter has verified that it has traced a loop, it decides how to compile what it has. There is an optional optimization phase between these actions which is covered further down this page. The backend converts the trace operations into assembly for the particular machine. It then hands the compiled loop back to the frontend. The next time the loop is seen in application code, the optimized assembly can be run instead of the normal interpreter.
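To make the control flow in that quote concrete, here is a minimal toy sketch in plain Python. It is not PyPy code: `ToyJIT`, `HOT_THRESHOLD`, `run_loop_body` and the `compile_trace` callback are all hypothetical names, and the "compiled" version is just another Python callable standing in for the assembly the backend would produce. It only shows the hand-off: interpret until a loop is hot, compile once, then run the compiled version the next time the loop is seen.

```python
HOT_THRESHOLD = 3  # hypothetical value, chosen small for the demo


class ToyJIT:
    """Toy stand-in for the frontend/backend hand-off (not PyPy's classes)."""

    def __init__(self):
        self.counters = {}  # loop key -> how often it was interpreted
        self.compiled = {}  # loop key -> "compiled" callable
        self.log = []       # which path ran on each iteration

    def run_loop_body(self, key, interpret, compile_trace, state):
        fast = self.compiled.get(key)
        if fast is not None:
            # The loop was compiled earlier: run that instead of interpreting.
            self.log.append("compiled")
            return fast(state)
        self.counters[key] = self.counters.get(key, 0) + 1
        if self.counters[key] >= HOT_THRESHOLD:
            # Loop is hot: "compile" it once and remember the result.
            self.compiled[key] = compile_trace()
        self.log.append("interpreted")
        return interpret(state)


def interpret(state):
    return state + 1  # slow path: pretend this is bytecode dispatch


def compile_trace():
    return lambda state: state + 1  # stand-in for generated machine code


jit = ToyJIT()
total = 0
for _ in range(5):
    total = jit.run_loop_body("loop@0", interpret, compile_trace, total)
print(total, jit.log)
```

The first three iterations go through the interpreter path and the last two through the "compiled" one, with the same result either way.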
This paper (PDF) might also be helpful.
Edit: Looking at the x86 backend in rpython/jit/backend/x86/rx86.py, the backend doesn't so much compile as emit machine code directly. Look at the X86_64_CodeBuilder and AbstractX86CodeBuilder classes. One level higher is the Assembler386 class in rpython/jit/backend/x86/assembler.py. This assembler uses the MachineCodeBlockWrapper from rpython/jit/backend/x86/codebuf.py, which is based on the X86_64_CodeBuilder for x86-64.
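As a rough illustration of what "emitting machine code directly" means, here is a toy sketch that writes raw x86-64 instruction bytes into a buffer. It does not use or reproduce the PyPy classes named above: `ToyCodeBuilder` and its methods are hypothetical names invented for this example. The only facts it relies on are the standard encodings `B8 imm32` for `MOV EAX, imm32` and `C3` for `RET`.

```python
import struct


class ToyCodeBuilder:
    """Hypothetical byte emitter, for illustration only (not PyPy's builder)."""

    def __init__(self):
        self.buf = bytearray()

    def writechar(self, byte):
        # Append a single raw byte to the code buffer.
        self.buf.append(byte)

    def mov_eax_imm32(self, value):
        self.writechar(0xB8)                  # opcode: MOV EAX, imm32
        self.buf += struct.pack("<i", value)  # 32-bit little-endian immediate

    def ret(self):
        self.writechar(0xC3)                  # opcode: RET


builder = ToyCodeBuilder()
builder.mov_eax_imm32(42)
builder.ret()
code = bytes(builder.buf)
print(code.hex())  # b82a000000c3
```

There is no separate compiler or assembler pass here: the bytes in `code` are already the instructions the CPU would execute. Actually running them would additionally require copying them into memory mapped as executable, which this sketch leaves out.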