Note: The title is deliberately provocative (to make you click on it and want to close-vote the question) and I don't want to look preoccupied.
I've been reading and hearing more and more about PyPy. It's like a linear graph.
Why is PyPy so special? As far as I know implementations of dynamic languages written in the languages itself aren't such a rare thing, or am I not getting something?
Some people even call PyPy "the future" [of python], or see some sort of deep potential in this implementation. What exactly is the meaning of this?
PyPy aims to provide a common translation and support framework for producing implementations of dynamic languages, emphasizing a clean separation between language specification and implementation aspects.
PyPy is a drop-in replacement for the stock Python interpreter, CPython. Whereas CPython compiles Python to intermediate bytecode that is then interpreted by a virtual machine, PyPy uses just-in-time (JIT) compilation to translate Python code into machine-native assembly language.
Pypy is as fast as or faster than c/c++ in some applications/benchmarks. And with python (or interpreted langs in general) you gain a repl, a shorter write -> compile -> test loop, and generally speaking a higher rate of development.
What they have implemented in "normal Python" is the RPython "compiler" (the translation framework referred to in the block quote). This is burying the lede. Most of the performance comes from translation to C (which makes the interpreter not that much slower than CPython), and JIT, which makes hot paths much faster.
Good thing to be aware when talking about the PyPy project is that it aims to actually provide two deliverables: first is JIT compiler generator. Yes, generator, meaning that they are implementing a framework for writing implementations of highly dynamic programming languages, such as Python. The second one is the actual test of this framework, and is the PyPy Python interpreter implementation.
Now, there are multiple answers why PyPy is so special: the project development is running from 2004, started as a research project rather than from a company, reimplements Python in Python, implements a JIT compiler in Python, and can translate RPython (Python code with some limitations for the framework to be able to translate that code to C) to compiled binary.
The current version of PyPy is 99% compatible with CPython version 2.5, and can run Django, Twisted and many other Python programs. There used to be a limitation of not being able to run existing CPython C extensions, but that is also being addressed with cpyext module in PyPy. C API compatibility is possible and to some extent already implemented. JIT is also very real, see this pystone comparison.
With CPython:
Python 2.5.5 (r255:77872, Apr 21 2010, 08:44:16) [GCC 4.4.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from test import pystone >>> pystone.main(1000000) Pystone(1.1) time for 1000000 passes = 12.28 This machine benchmarks at 81433.2 pystones/second
With PyPy:
Python 2.5.2 (75632, Jun 28 2010, 14:03:25) [PyPy 1.3.0] on linux2 Type "help", "copyright", "credits" or "license" for more information. And now for something completely different: ``A radioactive cat has 18 half-lives.'' >>>> from test import pystone >>>> pystone.main(1000000) Pystone(1.1) time for 1000000 passes = 1.50009 This machine benchmarks at 666625 pystones/second
So you can get a nearly 10x speedup just by using PyPy on some calculations!
So, as PyPy project is slowly maturing and offering some advantages, it is attracting more interest from people trying to address speed issues in their code. An alternative to PyPy is unladden swallow (a Google project) which aims to speed up CPython implementation by using LLVM's JIT capabilities, but progress on unladden swallow was slowed because the developer needed to deal with bugs in LLVM.
So, to sum it up, I guess PyPy is regarded the future of Python because it's separating language specification from VM implementation. Features introduced in, eg. stackless Python, could then be implemented in PyPy with very little extra effort, because it's just altering language specification and you keep the shared code the same. Less code, less bugs, less merging, less effort.
And by writing, for example, a new bash shell implementation in RPython, you could get a JIT compiler for free and speed up many linux shell scripts without actually learning any heavy JIT knowledge.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With