I recently tried PyPy and was intrigued by the approach. I have lots of C extensions for Python, which all use PyArray_DATA() to obtain a pointer to the data sections of numpy arrays. Unfortunately, PyPy doesn't appear to export an equivalent for their numpypy arrays in their cpyext module, so I tried following the recommendation on their website to use ctypes. This pushes the task of obtaining the pointer to the Python level.
There appear to be two ways:
import ctypes as C
p_t = C.POINTER(C.c_double)

def get_ptr_ctypes(x):
    return x.ctypes.data_as(p_t)

def get_ptr_array(x):
    return C.cast(x.__array_interface__['data'][0], p_t)
Only the second one works on PyPy, so for compatibility the choice is clear. For CPython, both are slow as hell and a complete bottleneck for my application! Is there a fast and portable way of obtaining this pointer? Or is there an equivalent of PyArray_DATA() for PyPy (possibly undocumented)?
I still haven't found an entirely satisfactory solution, but there is something one can do to obtain the pointer with a lot less overhead in CPython. First off, the reason both ways mentioned above are so slow is that .ctypes and .__array_interface__ are on-demand attributes: they are computed on every access by the getters array_ctypes_get() and array_interface_get() in numpy/numpy/core/src/multiarray/getset.c. The first imports ctypes and creates a numpy.core._internal._ctypes instance, while the second creates a new dictionary and populates it with lots of unnecessary stuff in addition to the data pointer.
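To make the cost of an on-demand attribute concrete, here is a minimal sketch in pure Python. FakeArray is a hypothetical stand-in invented for illustration, not NumPy's implementation; it only mimics the behaviour described above, where the getter runs and builds a fresh dictionary on every access.

```python
import ctypes


class FakeArray:
    """Hypothetical stand-in for an ndarray (illustration only)."""

    def __init__(self, n):
        self._buf = (ctypes.c_double * n)()  # owns the data buffer
        self.getter_calls = 0

    @property
    def __array_interface__(self):
        # Runs on every attribute access and builds a new dict each time,
        # even though the caller only wants the data address.
        self.getter_calls += 1
        return {
            "data": (ctypes.addressof(self._buf), False),
            "shape": (len(self._buf),),
            "version": 3,
        }


x = FakeArray(8)
first = x.__array_interface__
second = x.__array_interface__

same_object = first is second                          # a new dict per access
same_address = first["data"][0] == second["data"][0]   # same underlying buffer
```

Two lookups give two distinct dictionaries that carry the same address, so code that fetches the pointer in a tight loop pays for the dictionary construction every time.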
There is nothing one can do on the Python level about this overhead, but one can write a micro-module on the C-level that bypasses most of the overhead:
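#include <Python.h>
#include <numpy/arrayobject.h>

/* Return the address of the array's data buffer as a Python integer.
   No type check is done: obj must be a numpy array. */
static PyObject *_get_ptr(PyObject *self, PyObject *obj) {
    return PyLong_FromVoidPtr(PyArray_DATA((PyArrayObject *)obj));
}

static PyMethodDef methods[] = {
    {"_get_ptr", _get_ptr, METH_O, "Wrapper to PyArray_DATA()"},
    {NULL, NULL, 0, NULL}
};

/* Python 2 style module initialization. */
PyMODINIT_FUNC initaccel(void) {
    Py_InitModule("accel", methods);
}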
Compile as usual as an Extension in setup.py, and import as
try:
    from accel import _get_ptr

    def get_ptr(x):
        return C.cast(_get_ptr(x), p_t)
except ImportError:
    get_ptr = get_ptr_array
On PyPy, from accel import _get_ptr will fail and get_ptr will fall back to get_ptr_array, which works with Numpypy.
As far as performance goes, for light-weight C function calls, ctypes + accel._get_ptr() is still quite a bit slower than the native CPython extension, which has essentially no overhead. It is of course much faster than get_ptr_ctypes() and get_ptr_array() above, so the overhead may become insignificant for medium-weight C function calls.
One has gained compatibility with PyPy, although I have to say that after spending quite a bit of time trying to evaluate PyPy for my scientific computation applications, I don't see a future for it as long as they (quite stubbornly) refuse to support the full CPython API.
Update
I found that ctypes.cast() was now becoming the bottleneck after introducing accel._get_ptr(). One can get rid of the casts by declaring all pointers in the interface as ctypes.c_void_p. This is what I ended up with:
def get_ptr_ctypes2(x):
    return x.ctypes._data

def get_ptr_array(x):
    return x.__array_interface__['data'][0]

try:
    from accel import _get_ptr as get_ptr
except ImportError:
    get_ptr = get_ptr_array
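Declaring pointer arguments as ctypes.c_void_p is what lets these functions return a plain integer address with no ctypes.cast(). A minimal sketch of that idea, assuming a POSIX-like system where the C library can be loaded by name; libc's memset stands in for the actual C functions, and the ctypes buffer stands in for numpy array data:

```python
import ctypes
import ctypes.util

# Assumption: POSIX-like platform; find_library("c") locates the C library
# (CDLL(None) is used implicitly if it returns None).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Pointer parameter declared as c_void_p: ctypes then accepts a plain int.
libc.memset.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_size_t]
libc.memset.restype = ctypes.c_void_p

buf = (ctypes.c_double * 4)(1.0, 2.0, 3.0, 4.0)  # stand-in for array data
addr = ctypes.addressof(buf)                      # plain Python integer

# The integer address goes straight in; no ctypes.cast() is involved.
libc.memset(addr, 0, ctypes.sizeof(buf))
values = list(buf)
```

The trade-off is that c_void_p carries no element type, so ctypes can no longer catch a pointer of the wrong type being passed.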
The get_ptr_ctypes2() variant avoids the cast by accessing the hidden ndarray.ctypes._data attribute directly. Here are some timing results for calling heavy-weight and light-weight C functions from Python:
                               heavy C (few calls)   light C (many calls)
ctypes + get_ptr_ctypes():           0.71 s                15.40 s
ctypes + get_ptr_ctypes2():          0.68 s                13.30 s
ctypes + get_ptr_array():            0.65 s                11.50 s
ctypes + accel._get_ptr():           0.63 s                 9.47 s
native CPython:                      0.62 s                 8.54 s
Cython (no decorators):              0.64 s                 9.96 s
So, with accel._get_ptr() and no ctypes.cast() calls, ctypes' speed is actually competitive with a native CPython extension. So I just have to wait until someone rewrites h5py, matplotlib and scipy with ctypes to be able to try PyPy for anything serious...