
Portable/fast way to obtain a pointer to Numpy/Numpypy data

I recently tried PyPy and was intrigued by the approach. I have lots of C extensions for Python, which all use PyArray_DATA() to obtain a pointer to the data sections of numpy arrays. Unfortunately, PyPy doesn't appear to export an equivalent for their numpypy arrays in their cpyext module, so I tried following the recommendation on their website to use ctypes. This pushes the task of obtaining the pointer to the Python level.

There appear to be two ways:

import ctypes as C
p_t = C.POINTER(C.c_double)

def get_ptr_ctypes(x):
    return x.ctypes.data_as(p_t)

def get_ptr_array(x):
    return C.cast(x.__array_interface__['data'][0], p_t)

Only the second one works on PyPy, so for compatibility the choice is clear. For CPython, both are slow as hell and a complete bottleneck for my application! Is there a fast and portable way of obtaining this pointer? Or is there an equivalent of PyArray_DATA() for PyPy (possibly undocumented)?

Stefan asked Mar 04 '13 20:03


1 Answer

I still haven't found an entirely satisfactory solution, but there is a way to obtain the pointer with a lot less overhead in CPython. First off, both ways mentioned above are slow because .ctypes and .__array_interface__ are on-demand attributes, computed on every access by array_ctypes_get() and array_interface_get() in numpy/numpy/core/src/multiarray/getset.c. The first imports ctypes and creates a numpy.core._internal._ctypes instance, while the second creates a new dictionary and populates it with a lot of information besides the data pointer (shape, strides, type string and so on).
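A minimal sketch to illustrate the on-demand behaviour described above, assuming a reasonably recent NumPy on CPython. The variable names are only for illustration; the point is that every attribute access builds a fresh object, which is where the per-call cost comes from.

```python
import numpy as np

x = np.zeros(8)

# Each access runs the getter again, so two accesses yield two distinct objects.
iface_a = x.__array_interface__
iface_b = x.__array_interface__
fresh_dict_each_time = iface_a is not iface_b

ct_a = x.ctypes
ct_b = x.ctypes
fresh_ctypes_each_time = ct_a is not ct_b

# Both routes still report the same integer address of the data buffer.
addr_from_interface = iface_a["data"][0]
addr_from_ctypes = x.ctypes.data
```

If both flags come out True, a loop that fetches the pointer on every C call pays for a new dictionary or a new helper object each time.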

There is nothing one can do at the Python level about this overhead, but one can write a micro-module at the C level that bypasses most of it:

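#include <Python.h>
#include <numpy/arrayobject.h>

/* Return the address of the array's data buffer as a Python integer.
   No type check is done for speed: obj must be a numpy array. */
static PyObject *_get_ptr(PyObject *self, PyObject *obj) {
    return PyLong_FromVoidPtr(PyArray_DATA((PyArrayObject *)obj));
}

static PyMethodDef methods[] = {
    {"_get_ptr", _get_ptr, METH_O, "Wrapper to PyArray_DATA()"},
    {NULL, NULL, 0, NULL}
};

/* Python 2 style module initialization. */
PyMODINIT_FUNC initaccel(void) {
    Py_InitModule("accel", methods);
}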

Compile it as a regular Extension in setup.py and import it like this:

try:
    from accel import _get_ptr
    def get_ptr(x):
        return C.cast(_get_ptr(x), p_t)
except ImportError:
    get_ptr = get_ptr_array

On PyPy, from accel import _get_ptr will fail and get_ptr will fall back to get_ptr_array, which works with Numpypy.

As far as performance goes, for light-weight C function calls, ctypes + accel._get_ptr() is still quite a bit slower than the native CPython extension, which has essentially no overhead. It is of course much faster than get_ptr_ctypes() and get_ptr_array() above, so that the overhead may become insignificant for medium-weight C function calls.

One has gained compatibility with PyPy, although I have to say that after spending quite a bit of time trying to evaluate PyPy for my scientific computation applications, I don't see a future for it as long as they (quite stubbornly) refuse to support the full CPython API.

Update

After introducing accel._get_ptr(), I found that ctypes.cast() had become the bottleneck. One can get rid of the casts by declaring all pointers in the interface as ctypes.c_void_p. This is what I ended up with:

def get_ptr_ctypes2(x):
    return x.ctypes._data

def get_ptr_array(x):
    return x.__array_interface__['data'][0]

try:
    from accel import _get_ptr as get_ptr
except ImportError:
    get_ptr = get_ptr_array

Here, get_ptr_ctypes2() avoids the cast by accessing the hidden ndarray.ctypes._data attribute directly. Here are some timing results for calling heavy-weight and light-weight C functions from Python:

                             heavy C (few calls)      light C (many calls)
ctypes + get_ptr_ctypes():         0.71 s                   15.40 s
ctypes + get_ptr_ctypes2():        0.68 s                   13.30 s
ctypes + get_ptr_array():          0.65 s                   11.50 s
ctypes + accel._get_ptr():         0.63 s                    9.47 s

native CPython:                    0.62 s                    8.54 s
Cython (no decorators):            0.64 s                    9.96 s

So, with accel._get_ptr() and no ctypes.cast()s, ctypes' speed is actually competitive with a native CPython extension. So I just have to wait until someone rewrites h5py, matplotlib and scipy with ctypes to be able to try PyPy for anything serious...
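To illustrate the c_void_p trick from the update, here is a minimal sketch that needs only the standard library. The names sum_doubles and SUM_PROTO are hypothetical, and the "C function" is a ctypes callback standing in for a real library routine. An array.array plays the role of the numpy array, and its buffer_info() address plays the role of the integer returned by get_ptr().

```python
import ctypes
from array import array

# Hypothetical stand-in for a C routine:
#     double sum_doubles(const double *p, size_t n);
# The pointer parameter is declared as c_void_p, so ctypes accepts a plain
# integer address and no ctypes.cast() is required at the call site.
SUM_PROTO = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_void_p, ctypes.c_size_t)

def _sum_doubles(addr, n):
    # View the n doubles stored at addr without copying them.
    values = (ctypes.c_double * n).from_address(addr)
    return sum(values)

sum_doubles = SUM_PROTO(_sum_doubles)

data = array("d", [1.0, 2.0, 3.5])
addr, length = data.buffer_info()  # integer address and element count

total = sum_doubles(addr, length)  # the integer is passed straight through
```

With a real shared library the same idea would be written as lib.func.argtypes = [ctypes.c_void_p, ...]. The trade-off is that ctypes can no longer check that the pointer has the expected element type.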

Stefan answered Sep 21 '22 09:09