
Numpy C API: Link several object files

Tags: python, c, gcc, numpy, api

I'm using the C API of numpy to write some functions for matrix calculation. Today I wanted to move some parts of my functions into a separate .c file and use a header to declare them. Now I have a strange problem that has to do with numpy's import_array function. I've tried to simplify the problem as much as possible. First, here is the working program:

mytest.c

#include "mytest.h"

PyObject* my_sub_function() {
    npy_intp dims[2] = {2, 2};
    double data[] = {0.1, 0.2, 0.3, 0.4};

    PyArrayObject* matrix = (PyArrayObject*)PyArray_SimpleNew(2, dims, NPY_FLOAT64);
    memcpy(PyArray_DATA(matrix), data, sizeof(double) * dims[0] * dims[1]);

    return (PyObject*)matrix;
}

static PyObject* my_test_function(PyObject* self, PyObject* args) {
    return my_sub_function();
}

static PyMethodDef methods[] = {
    {"my_test_function", my_test_function, METH_VARARGS, ""},
    {0, 0, 0, 0}
};

static struct PyModuleDef module = {
    PyModuleDef_HEAD_INIT, "mytest", 0, -1, methods
};

PyMODINIT_FUNC PyInit_mytest() {
    import_array();
    return PyModule_Create(&module);
}

mytest.h

#ifndef mytest_h
#define mytest_h

#include <Python.h>
#include <numpy/arrayobject.h>
#include <numpy/npy_common.h>

PyObject* my_sub_function();

#endif

Makefile

all: mytest.o
    gcc -shared -Wl,-soname,mytest.so -o mytest.so mytest.o

mytest.o:
    gcc -fPIC -c mytest.c `pkg-config --cflags python3`

clean:
    rm -rf *.so
    rm -rf *.o

Everything works as expected. I can call make and then load the module and call the function:

test.py

import mytest
print(mytest.my_test_function())

If I removed import_array from the init function, there would be a segfault, which is the behaviour reported on many mailing lists and forums.

Now I just want to remove the whole function my_sub_function from mytest.c and move it into a file called sub.c:

#include "mytest.h"

PyObject* my_sub_function() {
    npy_intp dims[2] = {2, 2};
    double data[] = {0.1, 0.2, 0.3, 0.4};

    PyArrayObject* matrix = (PyArrayObject*)PyArray_SimpleNew(2, dims, NPY_FLOAT64);
    memcpy(PyArray_DATA(matrix), data, sizeof(double) * dims[0] * dims[1]);

    return (PyObject*)matrix;
}

The new Makefile is:

all: mytest.o sub.o
    gcc -shared -Wl,-soname,mytest.so -o mytest.so mytest.o sub.o

mytest.o:
    gcc -fPIC -c mytest.c `pkg-config --cflags python3`

sub.o:
    gcc -fPIC -c sub.c `pkg-config --cflags python3`

clean:
    rm -rf *.so
    rm -rf *.o

If I now load the module and call the function, the call segfaults. I can resolve the problem by putting a call to import_array at the top of my_sub_function, but I don't think that is how the function is meant to be used.

So I'd like to know why this is happening and what the "clean" way to split up a numpy module into several source files is.

blogsh asked Sep 03 '12 19:09


1 Answer

By default, the import_array routine will only make the NumPy C API available within a single file. This is because it works through a table of function pointers stored in a static global variable (i.e. not exported, and only visible within the same file). In sub.c that variable is never initialised, so the first API call made there goes through a null pointer and crashes.
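The effect can be imitated with a toy model in a single file. This is only a sketch of the idea, not NumPy code: every name below (toy_fn, toy_table, api_seen_by_mytest, api_seen_by_sub and so on) is made up for illustration, and the two static pointers stand in for the separate copies that mytest.c and sub.c would each get from the header.

```c
#include <stddef.h>

/* Toy model only: none of these names exist in NumPy. */
typedef int (*toy_fn)(int);

int toy_double(int x) { return 2 * x; }

/* Stand-in for the real table of API function pointers. */
toy_fn toy_table[] = { toy_double };

/* Each source file that includes the header gets its own private copy
   of a pointer like these; "static" keeps it invisible to other files. */
static toy_fn *api_seen_by_mytest = NULL;  /* models the copy in mytest.c */
static toy_fn *api_seen_by_sub = NULL;     /* models the copy in sub.c */

/* Models import_array() being called from mytest.c: only that file's
   pointer is filled in. */
void toy_import_in_mytest(void) { api_seen_by_mytest = toy_table; }

/* Returns 1 when a call through the pointer would be safe, 0 when it
   would dereference NULL (the segfault from the question). */
int toy_api_usable(toy_fn *api) { return api != NULL; }
```

After toy_import_in_mytest() runs, only api_seen_by_mytest points at the table. The pointer that models sub.c is still NULL, which mirrors why the call in the new file crashes.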

As mentioned in the documentation, you can change this behaviour with a few preprocessor definitions:

  1. In all files for your extension, define PY_ARRAY_UNIQUE_SYMBOL to a unique variable that is unlikely to conflict with other extensions. Including your extension's module name in the variable name would be a good idea.

  2. In every file except for the one where you call import_array, define the symbol NO_IMPORT_ARRAY.

These symbols need to be defined before you include arrayobject.h in order for them to take effect.
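Applied to the two files from the question, the steps might look roughly like the fragment below. It is a sketch rather than a tested build: the symbol name mytest_ARRAY_API is an assumption chosen for illustration, and the defines sit before #include "mytest.h" because that header is what pulls in numpy/arrayobject.h.

```c
/* ---- sub.c: uses the NumPy C API but never calls import_array() ---- */
#define NO_IMPORT_ARRAY
#define PY_ARRAY_UNIQUE_SYMBOL mytest_ARRAY_API  /* hypothetical name */
#include "mytest.h"  /* includes numpy/arrayobject.h after the defines */

/* ... my_sub_function() as in the question, unchanged ... */


/* ---- mytest.c: the one file that calls import_array() ---- */
#define PY_ARRAY_UNIQUE_SYMBOL mytest_ARRAY_API  /* same name as in sub.c */
#include "mytest.h"

/* ... methods, module definition and PyInit_mytest() as in the question;
   PyInit_mytest() still calls import_array() ... */
```

The same definitions could instead be passed on the gcc command line with -D (for example -DNO_IMPORT_ARRAY for sub.c only), as long as both files end up with the same PY_ARRAY_UNIQUE_SYMBOL value.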

James Henstridge answered Oct 23 '22 04:10