Declaring numpy array and c pointer in cython

In my code I usually use numpy arrays to interface between methods and classes. To optimize the core parts of my program, I use Cython with C pointers to those numpy arrays. Unfortunately, the way I currently declare the arrays is quite long.

For example, let's say I have a method which should return a numpy array someArrayNumpy, but inside the function the pointer *someArrayPointers should be used for speed. This is how I usually declare this:

cdef:
    numpy.ndarray someArrayNumpy = numpy.zeros(someArraySize)
    numpy.ndarray[numpy.double_t, ndim=1] someArrayBuff = someArrayNumpy
    double *someArrayPointers = <double *> someArrayBuff.data

[... some Code ...]

return someArrayNumpy

As you can see, this takes up 3 lines of code for basically one array, and often I have to declare more of those arrays.

Is there a more compact/clever way to do this? I think I am missing something.

EDIT:

Because J. Martinot-Lagarde asked, I timed C pointers against "numpy pointers". The code was basically

for ii in range(someArraySize):
    someArrayPointers[ii] += 1

and

for ii in range(someArraySize):
    someArrayBuff[ii] += 1

with the definitions from above, but I added "ndim=1, mode='c'" just to make sure. Results are for someArraySize = 1e8 (time in ms):

testMartinot("cPointers")
531.276941299
testMartinot("numpyPointers")
498.730182648

That roughly matches what I remember from previous, different benchmarks.
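For readers who want to repeat the comparison, the two loops above could be timed along the following lines. This is only a sketch under stated assumptions: the function name time_loops and the use of time.perf_counter are illustrative and not taken from the original benchmark, whose testMartinot helper is not shown.

```cython
# Hypothetical reconstruction of the timing test; only someArraySize,
# someArrayBuff and someArrayPointers come from the question.
import time
import numpy
cimport numpy

def time_loops(Py_ssize_t someArraySize):
    cdef:
        # Typed buffer with the extra "ndim=1, mode='c'" hints mentioned above.
        numpy.ndarray[numpy.double_t, ndim=1, mode='c'] someArrayBuff = numpy.zeros(someArraySize)
        # Raw pointer into the same memory as someArrayBuff.
        double *someArrayPointers = <double *> someArrayBuff.data
        Py_ssize_t ii

    start = time.perf_counter()
    for ii in range(someArraySize):
        someArrayPointers[ii] += 1      # access through the C pointer
    pointer_ms = (time.perf_counter() - start) * 1000.0

    start = time.perf_counter()
    for ii in range(someArraySize):
        someArrayBuff[ii] += 1          # access through the typed buffer
    buffer_ms = (time.perf_counter() - start) * 1000.0

    # Both timings are in milliseconds.
    return pointer_ms, buffer_ms
```

Note that someArraySize has to be passed as an integer here (for example 10**8 rather than 1e8), since the argument is typed as Py_ssize_t.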

oli asked Jul 10 '13 11:07


1 Answer

You're actually declaring two numpy array variables here: the first one is generic and the second one has a specific dtype. You can skip the first line, because someArrayBuff is already an ndarray.

This gives:

numpy.ndarray[numpy.double_t] someArrayNumpy = numpy.zeros(someArraySize)
double *someArrayPointers = <double *> someArrayNumpy.data

You need at least two lines because you use someArrayPointers and return someArrayNumpy, so both have to be declared.
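Put into a complete function, the two-line declaration might look like the sketch below. The function name make_array and the loop body are placeholders for illustration; the "ndim=1, mode='c'" hints are the ones mentioned in the question's edit, and the call to import_array() is the usual setup step when cimporting numpy.

```cython
# Minimal sketch; make_array is a hypothetical name, not from the question.
import numpy
cimport numpy

# Customary initialisation of the NumPy C API after "cimport numpy".
numpy.import_array()

def make_array(Py_ssize_t someArraySize):
    cdef:
        # One typed declaration serves as both the returned array and the buffer.
        numpy.ndarray[numpy.double_t, ndim=1, mode='c'] someArrayNumpy = numpy.zeros(someArraySize)
        # Raw pointer into the same memory; only valid while someArrayNumpy is alive.
        double *someArrayPointers = <double *> someArrayNumpy.data
        Py_ssize_t ii

    # Placeholder for the real work done through the pointer.
    for ii in range(someArraySize):
        someArrayPointers[ii] += 1

    return someArrayNumpy
```

Because the pointer and the array share one block of memory, the values written through someArrayPointers are visible in the array that is returned.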


As a side note, are you sure that pointers are faster than ndarrays if you declare the type and the number of dimensions of the array?

numpy.ndarray[numpy.double_t, ndim=1] someArrayNumpy = numpy.zeros(someArraySize)
J. Martinot-Lagarde answered Oct 02 '22 05:10