The documentation for CPython thread support is frustratingly contradictory and sparse.
In general, it seems that everyone agrees that multi-threaded C applications which embed Python must always acquire the GIL before calling the Python interpreter. Typically, this is done by:
PyGILState_STATE s = PyGILState_Ensure();
/* do stuff with Python */
PyGILState_Release(s);
The docs spell this out plainly: https://docs.python.org/2/c-api/init.html#non-python-created-threads
However, in practice, getting a multi-threaded C program that embeds Python to actually work smoothly is another story. There seem to be a lot of quirks and surprises, even if you follow the docs exactly.
For example, it seems that behind the scenes, Python distinguishes between the "main thread" (which I guess is the thread that calls Py_Initialize) and other threads. Specifically, any attempt to acquire the GIL and run Python code in the "main" thread has consistently failed for me (at least with Python 3.x): the program aborts with a "Fatal Python error: drop_gil: GIL is not locked" message, which is silly because of course the GIL is locked!
Example:
int main()
{
    Py_Initialize();
    PyEval_InitThreads();
    PyEval_ReleaseLock();
    assert(PyEval_ThreadsInitialized());
    PyGILState_STATE s = PyGILState_Ensure();
    const char* command = "x = 5\nfor i in range(0,10): print(x*i)";
    PyRun_SimpleString(command);
    PyGILState_Release(s);
    Py_Finalize();
    return 0;
}
This simple program aborts with a "GIL is not locked" error, even though I clearly locked it. However, if I spawn another thread and attempt to acquire the GIL in that thread, everything works.
So CPython seems to have an (undocumented) concept of a "main thread", which is somehow different from secondary threads spawned by C.
Question: Is this documented anywhere? Has anyone had any experience that would shed some light on what exactly the rules are for acquiring the GIL, and if being in the "main" thread versus a child thread is supposed to have any bearing on this?
PS: Also, I've noted that PyEval_ReleaseLock is a deprecated API call, yet I've not seen any alternative which actually works. If you don't call PyEval_ReleaseLock after calling PyEval_InitThreads, your program immediately hangs. However, the newer alternative mentioned in the docs, PyEval_SaveThread, has never worked in practice for me: it immediately segfaults, at least if I call it in the "main" thread.
> This simple program aborts with a "GIL is not locked" error, even though I clearly locked it.
You locked the GIL, but then you released it again with PyGILState_Release, which means you invoked Py_Finalize without the GIL held.
> Has anyone had any experience that would shed some light on what exactly the rules are for acquiring the GIL
The intended way to think of the GIL is that, once you invoke PyEval_InitThreads(), someone always holds the GIL, or has released it only temporarily using Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS. See this answer for an extended discussion of a very similar confusion.
In your case the correct way to write the sample program would be as follows:
#include <Python.h>
#include <assert.h>

static void various(void)
{
    // here we don't have the GIL and can run non-Python code without
    // blocking Python
    PyGILState_STATE s = PyGILState_Ensure();
    // from this line, we have the GIL, and we can run Python code
    const char* command = "x = 5\nfor i in range(0,10): print(x*i)";
    PyRun_SimpleString(command);
    PyGILState_Release(s);
    // from this line, we no longer have the GIL
}

int main()
{
    Py_Initialize();
    PyEval_InitThreads();
    // here we have the GIL
    assert(PyEval_ThreadsInitialized());
    Py_BEGIN_ALLOW_THREADS
        // here we no longer have the GIL, although various() is free to
        // (temporarily) re-acquire it
        various();
    Py_END_ALLOW_THREADS
    // here we again have the GIL, which is why we can call Py_Finalize
    Py_Finalize();
    // at this point the GIL no longer exists
    return 0;
}