
Python C API - Is it thread safe?

I have a C extension that is called from my multithreaded Python application. I use a static variable i somewhere in a C function, and I have a few i++ statements later on that can be run from different Python threads (that variable is only used in my C code though, I don't yield it to Python).

For some reason I haven't met any race condition so far, but I wonder if it's just luck...

I don't have any thread-related C code (no Py_BEGIN_ALLOW_THREADS or anything).

I know that the GIL only guarantees that single bytecode instructions are atomic and thread-safe, so statements such as i += 1 in Python are not thread-safe.

But I don't know about an i++ instruction in a C extension. Any help?

DenverCoder9 asked Oct 18 '22 17:10


1 Answer

Python will not release the GIL while you are running C code, unless you either tell it to or cause Python code to be executed (see the warning note at the bottom!). The interpreter only releases the GIL between bytecode instructions, never during one, and from its point of view running a C function is part of executing the CALL_FUNCTION bytecode.* (Unfortunately I can't find a reference for this paragraph currently, but I'm almost certain it's right.)

Therefore, unless you do something specific to release it, your C code will be the only thread running, and so any operation you do in it should be thread safe.

If you specifically want to release the GIL - for example because you're doing a long calculation which doesn't interfere with Python, reading from a file, or sleeping while waiting for something else to happen - then the easiest way is to do Py_BEGIN_ALLOW_THREADS then Py_END_ALLOW_THREADS when you want to get it back. During this block you cannot use most Python API functions and it's your responsibility to ensure thread safety in C. The easiest way to do this is to only use local variables and not read or write any global state.
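As a rough sketch of the "only use local variables" advice, the worker below reads only its arguments and writes only locals, which is the kind of function that could run between the two macros. The name sum_of_squares and the idea that the data was already copied into a plain C array are assumptions made for illustration; the bracketing with the macros is shown only as a comment so the block compiles without Python.h.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical worker: reads only its arguments and writes only locals,
 * so it touches no global state and no Python objects.
 */
double sum_of_squares(const double *values, size_t n)
{
    assert(values != NULL || n == 0);
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += values[i] * values[i];
    }
    return total;
}

/*
 * Inside an extension function the call could be bracketed like this
 * (kept as a comment so the block compiles without Python.h):
 *
 *     double result;
 *     Py_BEGIN_ALLOW_THREADS
 *     result = sum_of_squares(values, n);
 *     Py_END_ALLOW_THREADS
 *     return PyFloat_FromDouble(result);
 */
```

Note that result is declared before Py_BEGIN_ALLOW_THREADS, since the macro pair opens and closes its own block, and that the PyFloat_FromDouble call only happens once the GIL is held again.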

If you've already got a C thread running without the GIL (thread A) then simply holding the GIL in thread B does not guarantee that thread A won't modify C global variables. To be safe you need to ensure that you never modify global state without some kind of locking mechanism (either the Python GIL or a C mechanism) in all your C functions.
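One possible "C mechanism" is a POSIX mutex around every access to the global state. The sketch below assumes a platform with pthreads; the names running_sum, sample_count, stats_add and stats_mean are made up for illustration. The lock keeps the two fields consistent whether or not the calling thread holds the GIL.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical global state: two fields that must always change together. */
static double running_sum = 0.0;
static long sample_count = 0;
static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;

/* Add one sample; safe to call with or without the GIL held. */
void stats_add(double sample)
{
    int rc = pthread_mutex_lock(&stats_lock);
    assert(rc == 0);
    (void)rc; /* silence the unused warning when NDEBUG is defined */
    running_sum += sample;
    sample_count += 1;
    pthread_mutex_unlock(&stats_lock);
}

/* Return the mean of all samples so far, or 0.0 if there are none. */
double stats_mean(void)
{
    pthread_mutex_lock(&stats_lock);
    double mean = sample_count ? running_sum / (double)sample_count : 0.0;
    pthread_mutex_unlock(&stats_lock);
    return mean;
}
```

Nothing inside the locked regions calls the Python API. That is deliberate: a thread that holds the mutex and then waits for the GIL can deadlock against a thread that holds the GIL and waits for the mutex.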


Additional thought

* One place where the GIL can be released in C code is if the C code calls something that causes Python code to be executed. This might be through using PyObject_Call. A less obvious place would be if Py_DECREF caused a destructor to be executed. You'd have the GIL back by the time your C code resumed, but you could no longer guarantee that global objects were unchanged. This obviously doesn't affect simple C like x++.


Belated Edit:

It should be emphasised that it's really, really, really easy to cause the execution of Python code. For this reason you shouldn't use the GIL in place of a mutex or actual locking mechanism. You should only consider it for operations that are really atomic (i.e. a single C API call) or entirely on non-Python C objects. You won't lose the GIL unexpectedly while executing C code, but a lot of C API calls may release the GIL, do something else, and then regain the GIL before returning to your C code.
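For a plain counter like the one in the question, one way to stop depending on the GIL at all is a C11 atomic. This is a swapped-in technique rather than something the answer prescribes, and it assumes a compiler that provides <stdatomic.h>; the names call_count, record_calls and calls_so_far are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical replacement for the question's `static int i`. */
static atomic_long call_count = 0;

/* Add `n` calls to the counter and return the new total. */
long record_calls(long n)
{
    assert(n > 0);
    /* atomic_fetch_add returns the value held before the addition */
    return atomic_fetch_add(&call_count, n) + n;
}

/* Read the current total without modifying it. */
long calls_so_far(void)
{
    return atomic_load(&call_count);
}
```

With this, record_calls(1) behaves like i++ but stays correct even if the surrounding code later releases the GIL or is called from a thread that never held it.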

The purpose of the GIL is to make sure that the Python internals don't get corrupted. The GIL will continue to serve this purpose within an extension module. However, race conditions that involve valid Python objects arranged in ways you don't expect are still available to you. For example:

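PySequence_SetItem(some_list, 0, some_item);
// the destructor of the previous contents of item 0 may have released the GIL,
// so another thread may have changed some_list before the next line runs
PyObject* item = PySequence_GetItem(some_list, 0); // returns a new reference
assert(item == some_item); // may not be true
Py_XDECREF(item); // drop the reference PySequence_GetItem gave us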

DavidW answered Oct 20 '22 05:10