I'm embedding the python interpreter in a multithreaded C application and I'm a little confused as to what APIs I should use to ensure thread safety.
From what I gathered, when embedding Python it is up to the embedder to acquire the GIL before making any other Python C API call. This is done with these functions:
gstate = PyGILState_Ensure();
// do some python api calls, run python scripts
PyGILState_Release(gstate);
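For illustration, this is roughly how I use that pair from a worker thread. It is only a minimal sketch assuming pthreads; the function name and the embedded script are placeholders:

#include <Python.h>

// Hypothetical worker-thread entry point: every native thread that calls
// into the Python C API brackets those calls with the Ensure/Release pair.
static void *worker_thread(void *arg)
{
    PyGILState_STATE gstate = PyGILState_Ensure();   // take the GIL for this thread
    PyRun_SimpleString("print('hello from a worker thread')");  // any Python/C API calls
    PyGILState_Release(gstate);                       // hand the GIL back
    return NULL;
}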
But this alone doesn't seem to be enough. I still got random crashes since it doesn't seem to provide mutual exclusion for the Python APIs.
After reading some more docs I also added PyEval_InitThreads() right after the call to Py_IsInitialized(), but this is where it gets confusing. The docs state that this function will:
Initialize and acquire the global interpreter lock
This suggests that when this function returns, the GIL is supposed to be locked and should be unlocked somehow. But in practice this doesn't seem to be required: with this line in place my multithreaded app worked perfectly, and mutual exclusion was maintained by the PyGILState_Ensure/Release functions.
When I tried adding PyEval_ReleaseLock() after PyEval_InitThreads(), the app deadlocked pretty quickly in a subsequent call to PyImport_ExecCodeModule().
So what am I missing here?
I had exactly the same problem and it is now solved by calling PyEval_SaveThread() immediately after PyEval_InitThreads(), as you suggest above. However, my actual problem was that I called PyEval_InitThreads() after Py_Initialize(), which then caused PyGILState_Ensure() to block when called from different, subsequent native threads. In summary, this is what I do now:
There is a global variable:
static int gil_init = 0;
From the main thread, load the native C extension and start the Python interpreter:
Py_Initialize()
From multiple other threads my app concurrently makes a lot of calls into the Python/C API:
if (!gil_init) {
    // One-time setup, done lazily by the first worker thread that calls in:
    // PyEval_InitThreads() acquires the GIL, so it must be followed by
    // PyEval_SaveThread() to release it again, otherwise every other
    // thread blocks in PyGILState_Ensure().
    gil_init = 1;
    PyEval_InitThreads();
    PyEval_SaveThread();
}

state = PyGILState_Ensure();
// Call Python/C API functions...
PyGILState_Release(state);
From the main thread, stop the Python interpreter:
Py_Finalize()
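For what it's worth, here is how I would assemble those steps into one self-contained program. This is only a sketch of the same idea, with the PyEval_InitThreads()/PyEval_SaveThread() pair hoisted into the main thread right after Py_Initialize() instead of behind the gil_init flag; the thread count and the embedded script are arbitrary, and note that PyEval_InitThreads() is a no-op on Python 3.7+, deprecated since 3.9, and dropped from the newest CPython releases:

#include <Python.h>
#include <pthread.h>

static void *worker(void *arg)
{
    // Every native thread takes the GIL before touching the Python/C API.
    PyGILState_STATE state = PyGILState_Ensure();
    PyRun_SimpleString("print('hello from a worker')");
    PyGILState_Release(state);
    return NULL;
}

int main(void)
{
    pthread_t threads[4];
    PyThreadState *save;
    int i;

    Py_Initialize();
    PyEval_InitThreads();        // creates and acquires the GIL (no-op on 3.7+)
    save = PyEval_SaveThread();  // release the GIL so worker threads can take it

    for (i = 0; i < 4; i++)
        pthread_create(&threads[i], NULL, worker, NULL);
    for (i = 0; i < 4; i++)
        pthread_join(threads[i], NULL);

    PyEval_RestoreThread(save);  // reacquire the GIL before shutting down
    Py_Finalize();
    return 0;
}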
All other solutions I've tried either caused random Python segfaults or deadlocks/blocking in PyGILState_Ensure().
The Python documentation really should be more clear on this and at least provide an example for both the embedding and extension use cases.
Eventually I figured it out.
After PyEval_InitThreads() you need to call PyEval_SaveThread(), which properly releases the GIL held by the main thread.
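In code, the main-thread sequence I ended up with looks roughly like this (a sketch; the tstate variable name is mine):

Py_Initialize();
PyEval_InitThreads();                          // acquires the GIL in the main thread
PyThreadState *tstate = PyEval_SaveThread();   // release it so other threads can PyGILState_Ensure()
// ... worker threads run, using PyGILState_Ensure()/PyGILState_Release() ...
PyEval_RestoreThread(tstate);                  // take the GIL back before shutting down
Py_Finalize();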