
Embedding python in multithreaded C application

I'm embedding the Python interpreter in a multithreaded C application, and I'm a little confused about which APIs I should use to ensure thread safety.

From what I gathered, when embedding Python it is up to the embedder to acquire the GIL before making any other Python C API call. This is done with these functions:

PyGILState_STATE gstate = PyGILState_Ensure();
// do some Python API calls, run Python scripts
PyGILState_Release(gstate);

But this alone doesn't seem to be enough. I still got random crashes since it doesn't seem to provide mutual exclusion for the Python APIs.

After reading some more docs I also added:

PyEval_InitThreads();

right after the call to Py_IsInitialized(), but that's where the confusing part comes in. The docs state that this function:

Initialize and acquire the global interpreter lock

This suggests that when this function returns, the GIL is locked and should be unlocked somehow, but in practice this doesn't seem to be required. With this line in place, my multithreaded application worked perfectly, and mutual exclusion was maintained by the PyGILState_Ensure/Release functions.
When I tried adding PyEval_ReleaseLock() after PyEval_InitThreads(), the app deadlocked pretty quickly in a subsequent call to PyImport_ExecCodeModule().

So what am I missing here?

shoosh asked May 16 '12 19:05



2 Answers

I had exactly the same problem, and it is now solved by calling PyEval_SaveThread() immediately after PyEval_InitThreads(), as you suggest above. However, my actual problem was that I called PyEval_InitThreads() after Py_Initialize(), which then caused PyGILState_Ensure() to block when called from other native threads started later. In summary, this is what I do now:

  1. There is a global variable:

    static int gil_init = 0; 
    
  2. From the main thread, load the native C extension and start the Python interpreter:

    Py_Initialize();
    
  3. From multiple other threads my app concurrently makes a lot of calls into the Python/C API:

    if (!gil_init) {
        gil_init = 1;
        PyEval_InitThreads();
        PyEval_SaveThread();
    }
    state = PyGILState_Ensure();
    // Call Python/C API functions...    
    PyGILState_Release(state);
    
  4. From the main thread, stop the Python interpreter:

    Py_Finalize();
    

All other solutions I've tried caused either random Python segfaults or deadlocks/blocking in PyGILState_Ensure().

The Python documentation really should be clearer on this and at least provide an example for both the embedding and extension use cases.

forman answered Oct 29 '22 09:10


Eventually I figured it out.
After

PyEval_InitThreads();

You need to call

PyEval_SaveThread();

This properly releases the GIL for the main thread.

shoosh answered Oct 29 '22 08:10