
When/Where does PyPy produce machine code?

Tags: python, pypy

I have skimmed through the PyPy implementation details and gone through the source code as well, but PyPy's execution path is still not entirely clear to me.

Sometimes bytecode is produced, and sometimes it is skipped in favor of immediate machine-code compilation (interpreter-level vs. app-level code). But I can't figure out exactly when and where the machine code is produced, to be handed to the OS for execution as low-level instructions (RAM/CPU).

I managed to get that straight in the case of CPython: there is a giant switch in ceval.c, itself already compiled, which interprets bytecode and runs the corresponding code (written in actual C). Makes sense.
But as far as PyPy is concerned, I have not managed to get a clear view of how this is done, specifically. (I do not want to get into PyPy's various optimization details; that's not what I am after here.)

I would be satisfied with an answer that points to the PyPy source code, so as to avoid "hearsay" and be able to see it "with my own eyes". (I spotted the JIT backends under /rpython, with assemblers for the various CPU architectures.)

Mehdi LAMRANI asked Aug 30 '20 20:08



1 Answer

Your best guides are the PyPy architecture documentation and the actual JIT documentation.

What jumped out the most for me is this:

we have a tracing JIT that traces the interpreter written in RPython, rather than the user program that it interprets.

This is covered in more detail in the JIT overview.

It seems to me that the "core" is this (from here):

Once the meta-interpreter has verified that it has traced a loop, it decides how to compile what it has. There is an optional optimization phase between these actions which is covered further down this page. The backend converts the trace operations into assembly for the particular machine. It then hands the compiled loop back to the frontend. The next time the loop is seen in application code, the optimized assembly can be run instead of the normal interpreter.
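To make that flow concrete, here is a toy sketch in plain Python. It is not PyPy code: every name in it (HOT_THRESHOLD, backend_compile, run) is made up for illustration, the "backend" produces a Python function rather than machine code, and it leaves out guards, the optimization phase, and the fact that PyPy traces the interpreter rather than the user program. It only shows the sequence: interpret, notice a hot loop, record a trace, compile it, then run the compiled version on later iterations.

```python
# Toy illustration only: none of these names exist in PyPy or RPython.
HOT_THRESHOLD = 3  # hypothetical: the trace is recorded on the 3rd pass over the loop


def backend_compile(trace):
    """Stand-in for the JIT backend.

    The real backend turns trace operations into machine code; this toy
    version builds Python source from the trace and exec()s it.
    """
    lines = ["def compiled_loop(acc, i):"]
    for op, arg in trace:
        if op == "add_i":
            lines.append("    acc = acc + i")
        elif op == "incr":
            lines.append(f"    i = i + {arg}")
    lines.append("    return acc, i")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["compiled_loop"]


def run(n):
    """Sum 0..n-1 with a tiny 'interpreter' that compiles its loop once hot."""
    loop_body = [("add_i", None), ("incr", 1)]  # the "bytecode" of the loop
    acc, i = 0, 0
    seen = 0          # how many times the loop has been interpreted
    compiled = None   # filled in once the loop has been traced and compiled
    stats = {"interpreted": 0, "compiled": 0}

    while i < n:
        if compiled is not None:
            # Fast path: run the compiled loop body instead of interpreting.
            acc, i = compiled(acc, i)
            stats["compiled"] += 1
            continue

        seen += 1
        # Start recording a trace once the loop counts as hot.
        trace = [] if seen == HOT_THRESHOLD else None
        for op, arg in loop_body:
            if op == "add_i":
                acc += i
            elif op == "incr":
                i += arg
            if trace is not None:
                trace.append((op, arg))
        stats["interpreted"] += 1

        if trace is not None:
            compiled = backend_compile(trace)

    return acc, stats


print(run(10))  # (45, {'interpreted': 3, 'compiled': 7})
```

The first three iterations go through the interpreter loop; the third one is also recorded, and from the fourth iteration on the compiled function runs instead.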

This paper (PDF) might also be helpful.

Edit: Looking at the x86 backend in rpython/jit/backend/x86/rx86.py, the backend doesn't so much compile as emit machine code directly. Look at the X86_64_CodeBuilder and AbstractX86CodeBuilder classes. One level higher is the Assembler386 class in rpython/jit/backend/x86/assembler.py. This assembler uses the MachineCodeBlockWrapper from rpython/jit/backend/x86/codebuf.py, which is based on X86_64_CodeBuilder on x86-64.
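To give a feel for what "emitting machine code directly" means, here is a minimal sketch of a code builder. It is a made-up class (ToyX86CodeBuilder and its method names are my own, not the API of rx86.py), and it knows only two instructions. The idea is the same in spirit, though: each method appends the raw instruction bytes to a buffer, with no textual assembly step in between.

```python
import struct


class ToyX86CodeBuilder:
    """Hypothetical, minimal stand-in for a code builder; not PyPy's class."""

    def __init__(self):
        self.buf = bytearray()

    def emit_byte(self, byte):
        # Append one raw byte of machine code to the buffer.
        self.buf.append(byte)

    def mov_eax_imm32(self, value):
        # x86 encoding of "mov eax, imm32": opcode 0xB8 followed by the
        # 32-bit immediate in little-endian byte order.
        self.emit_byte(0xB8)
        self.buf += struct.pack("<i", value)

    def ret(self):
        # x86 encoding of a near "ret": the single byte 0xC3.
        self.emit_byte(0xC3)

    def getvalue(self):
        return bytes(self.buf)


builder = ToyX86CodeBuilder()
builder.mov_eax_imm32(42)
builder.ret()
print(builder.getvalue().hex())  # b82a000000c3
```

Those six bytes are the body of a function that returns 42. To actually run them, a real JIT still has to copy them into memory that is mapped as executable and jump there; this sketch stops at producing the bytes.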

Roland Smith answered Sep 30 '22 18:09
