What are the implications of calling NumPy's C API functions from multiple threads?

Question

This is risky business, and I understand the Global Interpreter Lock to be a formidable foe of parallelism. However, if I'm using NumPy's C API (specifically the PyArray_DATA macro on a NumPy array), are there potential consequences to invoking it from multiple concurrent threads?

Note that I will still own the GIL and not be releasing it with NumPy's threading support. Also, even if NumPy makes no guarantees about thread safety but PyArray_DATA is thread-safe in practice, that's good enough for me.

I'm running Python 2.6.6 with NumPy 1.3.0 on Linux.

ide · Accepted Answer

Answering my own question here, but after poking into the source code for NumPy 1.3.0, I believe the answer is: Yes, PyArray_DATA is thread-safe.

PyArray_DATA is defined in ndarrayobject.h:

```
#define PyArray_DATA(obj) ((void *)(((PyArrayObject *)(obj))->data))
```

The PyArrayObject struct type is defined in the same file; the field of interest is:
```
char *data;
```
So the question becomes whether accessing the data field from multiple threads is safe.
Creating a new NumPy array from scratch (i.e., not deriving it from an existing data structure) passes a NULL data pointer to PyArray_NewFromDescr, defined in arrayobject.c.
This causes PyArray_NewFromDescr to invoke PyDataMem_NEW in order to allocate memory for the PyArrayObject's data field. This is simply a macro for malloc:
```
#define PyDataMem_NEW(size) ((char *)malloc(size))
```

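In summary, PyArray_DATA only reads a pointer out of the struct and modifies nothing, so calling it concurrently is thread-safe. Because each array created from scratch gets its own malloc'd buffer, it is also safe to write to separately created arrays from different threads, provided no thread resizes or frees an array while another is still using its data pointer.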
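
To illustrate the reasoning, here is a minimal sketch in plain C that does not use NumPy at all. FakeArray, FAKE_ARRAY_DATA, Job, fill_worker and fill_two_arrays are hypothetical names invented for this example; the struct keeps only the data field and the macro just mirrors the shape of the definition quoted above. Two threads each read the pointer and write into their own separately malloc'd buffer:

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for PyArrayObject: only the field the macro reads. */
typedef struct {
    char *data;
} FakeArray;

/* Same shape as the macro quoted above, applied to the stand-in struct. */
#define FAKE_ARRAY_DATA(obj) ((void *)(((FakeArray *)(obj))->data))

typedef struct {
    FakeArray *arr;   /* array this thread owns */
    size_t     size;  /* buffer size in bytes */
    int        fill;  /* byte value to write */
} Job;

static void *fill_worker(void *arg)
{
    Job *job = (Job *)arg;
    /* Reading the pointer changes nothing in the struct; the write
       touches only this thread's own buffer. */
    memset(FAKE_ARRAY_DATA(job->arr), job->fill, job->size);
    return NULL;
}

/* Returns 0 if both buffers end up with their own fill value, -1 otherwise. */
int fill_two_arrays(size_t size)
{
    FakeArray a = { (char *)malloc(size) };  /* separate allocation per array */
    FakeArray b = { (char *)malloc(size) };
    Job job_a = { &a, size, 0x11 };
    Job job_b = { &b, size, 0x22 };
    pthread_t ta, tb;
    int ok = -1;
    size_t i;

    if (a.data != NULL && b.data != NULL &&
        pthread_create(&ta, NULL, fill_worker, &job_a) == 0) {
        if (pthread_create(&tb, NULL, fill_worker, &job_b) == 0) {
            pthread_join(tb, NULL);
            ok = 0;
        }
        pthread_join(ta, NULL);
    }

    /* Both workers have finished; check that neither overwrote the other. */
    for (i = 0; ok == 0 && i < size; i++) {
        if (a.data[i] != 0x11 || b.data[i] != 0x22) {
            ok = -1;
        }
    }

    free(a.data);
    free(b.data);
    return ok;
}
```

Calling fill_two_arrays(4096) from a main function should return 0. The sketch only covers the case discussed here, where each array has its own buffer; arrays that share memory would need their own synchronization.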
