How do greenlets work?

Tags:

python

How are greenlets implemented? Python uses the C stack for the interpreter and it heap-allocates Python stack frames, but beyond that, how does it allocate/swap stacks, how does it hook into the interpreter and function call mechanisms, and how does this interact with C extensions? (Any quirks?)

There are some comments at the top of greenlet.c in the source, but they're a bit opaque. FWIW I'm coming from the perspective of someone who is unfamiliar with CPython internals but is very familiar with low-level systems programming, C, threads, events, coroutines/cooperative threads, kernel programming, etc.

(Some data points: they don't use ucontext.h and they do 2x memcpy, alloc, and free on every context switch.)

875

asked Jul 28 '10 00:07

Yang

2 Answers

When a Python program runs, you essentially have two pieces of code running under the hood.

First, there is the CPython interpreter's own C code, which uses the standard C stack for its internal stack frames. Second, there is the interpreted Python bytecode, which does not use the C stack but keeps its stack frames on the heap. A greenlet is just standard Python code and thus behaves identically.
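To make the heap-allocated frames concrete, here is a minimal sketch (not taken from the greenlet sources) that lets a function hand back its own frame object. It assumes CPython, where `inspect.currentframe()` returns a real frame; the names `make_frame` and `local_value` are made up for illustration.

```python
import inspect


def make_frame():
    """Return this call's own frame object so the caller can look at it later."""
    local_value = 42
    return inspect.currentframe()


frame = make_frame()

# The call has already returned, yet its frame is still alive and reachable
# like any other heap object; nothing about it depended on the C stack.
frame_name = frame.f_code.co_name
saved_local = frame.f_locals["local_value"]
```

Since the Python-level frames survive on the heap like this, the only thing a greenlet switch has to worry about is the interpreter's C stack.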

Now, in a typical microthreaded application, you'd have thousands if not millions of microthreads (greenlets) switching all over the place. Each switch is essentially equivalent to a function call with a deferred return (so to speak) and thus uses a bit of stack. The problem is that the interpreter's C stack will sooner or later overflow. This is exactly the problem the greenlet extension addresses: it is designed to move pieces of the stack back and forth between the C stack and the heap in order to avoid it.

As you know, there are three fundamental events with greenlets: a spawn, a switch, and a return. Let's look at those in turn:

A) A Spawn

The newly spawned greenlet is associated with its own base address in the stack (the point where we currently are). Apart from that, nothing special happens. The Python code of the newly spawned greenlet uses the heap in the normal way and the interpreter continues using the C stack as usual.

B) A Switch

When one greenlet switches to another, the relevant part of the C stack (starting from the base address of the switching greenlet) is copied to the heap. That C-stack area is then freed, and the target greenlet's previously saved interpreter stack data is copied from the heap into the newly freed area. The Python code of the target greenlet continues using the heap in the normal way. Of course, the extension code keeps track of all of this (which heap section belongs to which greenlet, and so on).

C) A Return

The stack is untouched, and the heap area of the returning greenlet is freed by the Python garbage collector.
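The spawn and switch steps above can be sketched as a toy model in pure Python. This is only an illustration of the copy-out/copy-in idea, not greenlet's actual algorithm: the class `ToyGreenlet`, the function `switch`, and the use of a list of labels as the "C stack" are all hypothetical, and the model assumes both greenlets were spawned at the same stack depth so they share one base offset. The return step is not modeled.

```python
class ToyGreenlet:
    """Toy stand-in for a greenlet: a base offset plus a heap copy of its slice."""

    def __init__(self, name, base):
        self.name = name
        self.base = base   # offset in the shared stack where this slice begins
        self.saved = []    # heap copy of the slice while switched out


def switch(stack, current, target):
    """Swap current's slice out of the shared stack and target's slice in.

    Simplifying assumption: current.base == target.base.
    """
    # 1. Copy the running greenlet's slice to the heap (the "memcpy out").
    current.saved = stack[current.base:]
    # 2. Release that region of the shared stack.
    del stack[current.base:]
    # 3. Copy the target's saved slice back into place (the "memcpy in").
    stack.extend(target.saved)
    target.saved = []
    return target


# The shared "C stack"; entries are just labels standing in for C frames.
stack = ["main"]

# Spawn: each greenlet records the current top of the stack as its base.
a = ToyGreenlet("a", base=len(stack))
b = ToyGreenlet("b", base=len(stack))

# Greenlet a runs and pushes two frames.
stack += ["a:eval", "a:call"]
current = a

# Switch a -> b: a's frames move to the heap; b has nothing saved yet.
current = switch(stack, current, b)
stack_when_b_starts = list(stack)
stack += ["b:eval"]

# Switch b -> a: b's frame is saved, a's frames are restored in place.
current = switch(stack, current, a)
stack_back_in_a = list(stack)
```

Note that the shared stack never grows beyond the deepest single greenlet, no matter how many switches happen, which is the overflow problem the extension is meant to avoid.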

Basically, that is it. Many more details and explanations can be found at http://www.stackless.com/pipermail/stackless-dev/2004-March/000022.html, or just by reading the code as pointed out in Alex's answer.

187

answered Sep 28 '22 23:09

Rabih Kodeih

If you get and study the greenlet sources, you'll see at the top of greenlet.c a long comment that starts at line 16 with the following summary...:

A PyGreenlet is a range of C stack addresses that must be saved and restored in such a way that the full range of the stack contains valid data when we switch to it.

and continues to line 82, summarizing exactly what you're asking about. Have you studied these lines (and the following 1000+ implementing them;-)...? I don't see a way to further squeeze these 66 lines down while still making sense, nor any added value in copying and pasting them here.

Basically, you'll see there is no real "hooking" to speak of (the C-level stack is switched back and forth "under the interpreter's nose", so to speak), except for the delicate interactions with thread state in multi-threaded code. The saving and restoring of a greenlet's state from/to the stack is based on memcpy calls, plus some calls to the Python memory manager to allocate/reallocate and free space for data coming from, or going back to, the stack. The three functions in lines 227-295 handle the grunt work, and they're wrapped in a couple of C macros at 298-310 "in order to simplify maintenance", as the comment there says.

The interface through which other C extensions can interact with the greenlet extension is implemented at lines 956-1045, and exposed through the "CObject API" (via greenlet.h, of course) documented here.

answered Sep 28 '22 22:09

Alex Martelli
