
CPython - locking the GIL in the main thread

Tags: python, c, cpython

The documentation for CPython thread support is frustratingly contradictory and sparse.

In general, it seems that everyone agrees that multi-threaded C applications which embed Python must always acquire the GIL before calling the Python interpreter. Typically, this is done by:

PyGILState_STATE s = PyGILState_Ensure();

/* do stuff with Python */

PyGILState_Release(s);

The docs spell this out plainly: https://docs.python.org/2/c-api/init.html#non-python-created-threads

However, in practice, getting a multi-threaded C program that embeds Python to actually work smoothly is another story. There seem to be a lot of quirks and surprises, even if you follow the docs exactly.

For example, it seems that behind the scenes, Python distinguishes between the "main thread" (which I guess is the thread that calls Py_Initialize) and other threads. Specifically, every attempt I have made to acquire the GIL and run Python code in the "main" thread has failed (at least with Python 3.x): the program aborts with the message "Fatal Python error: drop_gil: GIL is not locked", which is silly because of course the GIL is locked!

Example:

#include <Python.h>
#include <assert.h>

int main(void)
{
    Py_Initialize();
    PyEval_InitThreads();
    PyEval_ReleaseLock();

    assert(PyEval_ThreadsInitialized());

    PyGILState_STATE s = PyGILState_Ensure();

    const char* command = "x = 5\nfor i in range(0,10): print(x*i)";
    PyRun_SimpleString(command);

    PyGILState_Release(s);
    Py_Finalize();

    return 0;
}

This simple program aborts with a "GIL is not locked" error, even though I clearly locked it. However, if I spawn another thread and attempt to acquire the GIL in that thread, everything works.

So CPython seems to have an (undocumented) concept of a "main thread", which is somehow different from secondary threads spawned by C.

Question: Is this documented anywhere? Has anyone had any experience that would shed some light on what exactly the rules are for acquiring the GIL, and if being in the "main" thread versus a child thread is supposed to have any bearing on this?

PS: Also, I've noticed that PyEval_ReleaseLock is a deprecated API call, yet I've not seen any alternative that actually works. If you don't call PyEval_ReleaseLock after calling PyEval_InitThreads, your program immediately hangs. However, the newer alternative mentioned in the docs, PyEval_SaveThread, has never worked in practice for me: it immediately segfaults, at least if I call it in the "main" thread.

asked Jun 30 '14 21:06 by Siler



1 Answer

This simple program aborts with a "GIL is not locked error", even though I clearly locked it.

You locked the GIL, but then you released it again in PyGILState_Release, which means you invoked Py_Finalize without the GIL held.

Has anyone had any experience that would shed some light on what exactly the rules are for acquiring the GIL

The intended way to think of the GIL is that, once you invoke PyEval_InitThreads(), someone always holds the GIL, or has released it only temporarily using Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS. See this answer for an extended discussion of a very similar confusion.

In your case the correct way to write the sample program would be as follows:

#include <Python.h>
#include <assert.h>  // for assert() in main()

static void various(void)
{
    // here we don't have the GIL and can run non-Python code without
    // blocking Python

    PyGILState_STATE s = PyGILState_Ensure();
    // from this line, we have the GIL, and we can run Python code

    const char* command = "x = 5\nfor i in range(0,10): print(x*i)";
    PyRun_SimpleString(command);

    PyGILState_Release(s);
    // from this line, we no longer have the GIL
}

int main()
{
    Py_Initialize();
    PyEval_InitThreads();
    // here we have the GIL
    assert(PyEval_ThreadsInitialized());

    Py_BEGIN_ALLOW_THREADS
    // here we no longer have the GIL, although various() is free to
    // (temporarily) re-acquire it
    various();
    Py_END_ALLOW_THREADS

    // here we again have the GIL, which is why we can call Py_Finalize 
    Py_Finalize();

    // at this point the GIL no longer exists
    return 0;
}
answered Sep 23 '22 11:09 by user4815162342