I'm using the NumPy C API to write some functions for matrix calculations. Today I wanted to move parts of my functions into a separate .c file and declare them in a header. Now I have a strange problem involving NumPy's import_array function. I've tried to simplify the problem as much as possible. First, here is the working program:
mytest.c
#include "mytest.h"

PyObject* my_sub_function() {
    npy_intp dims[2] = {2, 2};
    double data[] = {0.1, 0.2, 0.3, 0.4};
    PyArrayObject* matrix = (PyArrayObject*)PyArray_SimpleNew(2, dims, NPY_FLOAT64);
    memcpy(PyArray_DATA(matrix), data, sizeof(double) * dims[0] * dims[1]);
    return (PyObject*)matrix;
}

static PyObject* my_test_function(PyObject* self, PyObject* args) {
    return my_sub_function();
}

static PyMethodDef methods[] = {
    {"my_test_function", my_test_function, METH_VARARGS, ""},
    {0, 0, 0, 0}
};

static struct PyModuleDef module = {
    PyModuleDef_HEAD_INIT, "mytest", 0, -1, methods
};

PyMODINIT_FUNC PyInit_mytest() {
    import_array();
    return PyModule_Create(&module);
}
mytest.h
#ifndef mytest_h
#define mytest_h
#include <Python.h>
#include <numpy/arrayobject.h>
#include <numpy/npy_common.h>
PyObject* my_sub_function();
#endif
Makefile
all: mytest.o
	gcc -shared -Wl,-soname,mytest.so -o mytest.so mytest.o
mytest.o: mytest.c mytest.h
	gcc -fPIC -c mytest.c `pkg-config --cflags python3`
clean:
	rm -rf *.so
	rm -rf *.o
Everything works as expected. I can call make and then load the module and call the function:
test.py
import mytest
print(mytest.my_test_function())
If I removed import_array from the init function, there would be a segfault, which is the behaviour that has been reported on many mailing lists and forums.
Now I just want to move the whole function my_sub_function out of mytest.c and into a file called sub.c:
#include "mytest.h"
PyObject* my_sub_function() {
    npy_intp dims[2] = {2, 2};
    double data[] = {0.1, 0.2, 0.3, 0.4};
    PyArrayObject* matrix = (PyArrayObject*)PyArray_SimpleNew(2, dims, NPY_FLOAT64);
    memcpy(PyArray_DATA(matrix), data, sizeof(double) * dims[0] * dims[1]);
    return (PyObject*)matrix;
}
The new Makefile is:
all: mytest.o sub.o
	gcc -shared -Wl,-soname,mytest.so -o mytest.so mytest.o sub.o
mytest.o:
	gcc -fPIC -c mytest.c `pkg-config --cflags python3`
sub.o:
	gcc -fPIC -c sub.c `pkg-config --cflags python3`
clean:
	rm -rf *.so
	rm -rf *.o
If I now load the module and call the function, the call segfaults. I can resolve the problem by putting a call to import_array at the top of my_sub_function, but I don't think that is how the function is meant to be used.
So I'd like to know why this is happening and what the "clean" way is to split a NumPy module into several source files.
By default, the import_array routine will only make the NumPy C API available within a single file. This is because it works through a table of function pointers stored in a static global variable (i.e. not exported, and only visible within the same file).
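To make that mechanism concrete, here is a minimal single-file model of the default behaviour. It does not use NumPy at all: fake_numpy_table, api_in_mytest_c, api_in_sub_c and the helper functions are made-up names for illustration, standing in for the private pointer that each .c file ends up with.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for NumPy's table of C API function pointers. */
static void *fake_numpy_table[1];

/* By default, every .c file that includes arrayobject.h gets its own static
   pointer to the table. These two variables model the copies in mytest.c and
   sub.c; in a real build both would be named PyArray_API. */
static void **api_in_mytest_c = NULL;
static void **api_in_sub_c = NULL;

/* Models import_array() running in mytest.c: only that file's copy is set. */
void fake_import_array_in_mytest_c(void) {
    api_in_mytest_c = fake_numpy_table;
}

/* Report whether each file's copy points at the table (1) or is NULL (0). */
int mytest_c_has_api(void) { return api_in_mytest_c != NULL; }
int sub_c_has_api(void)    { return api_in_sub_c != NULL; }

/* Walk through the scenario from the question. */
void demo(void) {
    fake_import_array_in_mytest_c();
    assert(mytest_c_has_api());   /* API calls made from mytest.c work */
    assert(!sub_c_has_api());     /* sub.c still sees a NULL table */
}
```

Calling demo() passes both assertions: after the simulated import, only the copy belonging to mytest.c is set. That would explain why PyArray_SimpleNew crashes once it is called from sub.c.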
As mentioned in the documentation, you can change this behaviour with a few preprocessor definitions:
1. In all files for your extension, define PY_ARRAY_UNIQUE_SYMBOL to a unique variable name that is unlikely to conflict with other extensions. Including your extension's module name in the variable name would be a good idea.
2. In every file except for the one where you call import_array, define the symbol NO_IMPORT_ARRAY.
These symbols need to be defined before you include arrayobject.h in order for them to take effect.
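As a rough sketch of what the two definitions do, the block below imitates the relevant part of the NumPy header in a single file. The name mytest_ARRAY_API is made up for this example, the section marked as a stand-in is a simplified approximation rather than the real header, and fake_numpy_table, fake_import_array and sub_can_see_api are hypothetical helpers. In the question's layout, both defines would have to come before #include "mytest.h", because that header is what pulls in arrayobject.h.

```c
#include <assert.h>
#include <stddef.h>

/* Top of mytest.c, the file that calls import_array().
   sub.c would use the same line and additionally put
   #define NO_IMPORT_ARRAY before including the header. */
#define PY_ARRAY_UNIQUE_SYMBOL mytest_ARRAY_API

/* ---- simplified stand-in for what arrayobject.h does with the macros ---- */
#ifdef PY_ARRAY_UNIQUE_SYMBOL
#define PyArray_API PY_ARRAY_UNIQUE_SYMBOL
#endif

#ifdef NO_IMPORT_ARRAY
extern void **PyArray_API;          /* other files: refer to the shared pointer */
#elif defined(PY_ARRAY_UNIQUE_SYMBOL)
void **PyArray_API;                 /* importing file: define it, visible to the linker */
#else
static void **PyArray_API = NULL;   /* default: a private copy per file */
#endif
/* ---- end of stand-in ---- */

/* Hypothetical stand-in for NumPy's table of C API function pointers. */
static void *fake_numpy_table[1];

/* Models import_array(): fills the pointer, now named mytest_ARRAY_API. */
void fake_import_array(void) { PyArray_API = fake_numpy_table; }

/* Stands in for code in sub.c, which reaches the table via the shared symbol. */
int sub_can_see_api(void) { return mytest_ARRAY_API != NULL; }

void demo(void) {
    assert(!sub_can_see_api());   /* nothing imported yet */
    fake_import_array();
    assert(sub_can_see_api());    /* one import now serves every file */
}
```

With the unique symbol in place there is a single pointer with external linkage, so the one import_array call in the init function is enough for every object file linked into the module.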