Global Interpreter Lock and access to data (e.g. for NumPy arrays)

Tags: python, python-c-api, numpy

I am writing a C extension for Python, which should release the Global Interpreter Lock while it operates on data. I think I have understood the mechanism of the GIL fairly well, but one question remains: Can I access data in a Python object while the thread does not own the GIL? For example, I want to read data from a (big) NumPy array in the C function while I still want to allow other threads to do other things on the other CPU cores. The C function should

1. release the GIL with Py_BEGIN_ALLOW_THREADS
2. read and work on the data without using Python functions
3. even write data to previously constructed NumPy arrays
4. reacquire the GIL with Py_END_ALLOW_THREADS

Is this safe? Of course, other threads are not supposed to change the variables which the C function uses. But maybe there is one hidden source of errors: could the Python interpreter move an object, e.g. by some sort of garbage collection, while the C function works on it in a separate thread?

To illustrate the question, consider the minimal but complete code below. Compile it (on Linux) with

gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -fPIC -I/usr/lib/pymodules/python2.7/numpy/core/include -I/usr/include/python2.7 -c gilexample.c -o gilexample.o
gcc -pthread -shared gilexample.o -o gilexample.so

and test it in Python with

import gilexample
gilexample.sum([1,2,3])

Is the code between Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS safe? It accesses the contents of a Python object, and I do not want to duplicate the (possibly large) array in memory.

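#include <Python.h>
#include <numpy/arrayobject.h>

// The relevant function
static PyObject * sum(PyObject * const self, PyObject * const args) {
  PyObject * X;
  if (!PyArg_ParseTuple(args, "O", &X))
    return NULL;  // Argument error; the exception is already set.

  // New reference to an aligned, C-contiguous array of doubles.
  // NPY_IN_ARRAY is spelled NPY_ARRAY_IN_ARRAY in NumPy >= 1.7.
  PyArrayObject * const X_double =
      (PyArrayObject *) PyArray_FROM_OTF(X, NPY_DOUBLE, NPY_IN_ARRAY);
  if (X_double == NULL)
    return NULL;  // Conversion failed; the exception is already set.

  // Extract plain C data while the GIL is still held.
  npy_intp const size = PyArray_SIZE(X_double);
  double const * const data = (double const *) PyArray_DATA(X_double);
  double sum = 0;

  Py_BEGIN_ALLOW_THREADS // IS THIS SAFE?

  npy_intp i;
  for (i=0; i<size; i++)
    sum += data[i];

  Py_END_ALLOW_THREADS

  // Release the reference only after the GIL has been reacquired.
  Py_DECREF(X_double);
  return PyFloat_FromDouble(sum);
}

// Python interface code
// List the C methods that this extension provides.
static PyMethodDef gilexampleMethods[] = {
  {"sum", sum, METH_VARARGS, "Sum the elements of an array."},
  {NULL, NULL, 0, NULL}     /* Sentinel - marks the end of this structure */
};

// Tell Python about these methods (Python 2 module initialization).
PyMODINIT_FUNC initgilexample(void)  {
  (void) Py_InitModule("gilexample", gilexampleMethods);
  import_array();  // Must be present for NumPy.
}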

asked Jan 11 '12 18:01

Daniel

1 Answer

Is this safe?

Strictly, no. I think you should move the calls to PyArray_SIZE and PyArray_DATA outside the GIL-less block; if you do that, the block operates on C data only. You might also want to increment the reference count on the object before entering the GIL-less block and decrement it afterwards, so that the array cannot be deallocated while you work on its data.

After your edits, it should be safe. Don't forget to decrement the reference count afterwards.
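
One way to make "operating on C data only" hard to violate by accident is to factor the GIL-free work into a helper that only ever sees C types. The following is a minimal sketch, not part of the original code: the name sum_doubles and its signature are hypothetical, and the comment at the end only outlines how the extension function might call it.

```c
#include <stddef.h>

/* Hypothetical helper (an assumption, not in the original code): the part of
 * the work that runs without the GIL. It receives only a raw pointer and a
 * length, so it cannot touch a PyObject or call the Python C API. */
static double sum_doubles(const double *data, size_t n)
{
    double total = 0.0;
    size_t i;
    for (i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Possible use inside the extension function (kept as a comment because it
 * needs Python.h and NumPy). The pointer and size are read while the GIL is
 * held, and the reference to X_double is kept until the GIL is reacquired:
 *
 *     double const * const data = (double const *) PyArray_DATA(X_double);
 *     npy_intp const size = PyArray_SIZE(X_double);
 *     double result;
 *     Py_BEGIN_ALLOW_THREADS
 *     result = sum_doubles(data, (size_t) size);
 *     Py_END_ALLOW_THREADS
 *     Py_DECREF(X_double);
 */
```

Because the helper has no Python dependency, it can also be compiled and tested on its own, separately from the extension module.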

answered Sep 18 '22 23:09

Fred Foo
