Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to account for column-contiguous array when extending numpy with C

I have a C-function to normalize the rows of an array in log-space (this prevents numerical underflow).

The prototype of my C-function is as follows:

void normalize_logspace_matrix(size_t nrow, size_t ncol, double* mat);

You can see that it takes a pointer to an array and modifies it in place. The C-code of course assumes the data is saved as a C-contiguous array, i.e. row-contiguous.

I wrap the function as follows using Cython (imports and cdef extern from omitted):

def normalize_logspace(np.ndarray[np.double_t, ndim=2] mat):
    cdef Py_ssize_t n, d
    n = mat.shape[0]
    d = mat.shape[1]
    normalize_logspace_matrix(n, d, <double*> mat.data)
    return mat

Most of the time numpy-arrays are row-contiguous and the function works fine. However, if a numpy-array has been previously transposed the data is not copied around but just a new view into the data is returned. In this case my function fails because the array is no longer row-contiguous.

I can get around this by defining the array to have Fortran-contiguous order, such that after transposing it will be C-contiguous:

A = np.array([some_func(d) for d in range(D)], order='F').T
A = normalize_logspace(A)

Obviously that's very error-prone and the user has to take care that the array is in the correct order which is something that the user shouldn't need to care about in Python.

What's the best way how I can make this work with both row- and column-contiguous arrays? I assume that some kind of array-order checking in Cython is the way to go. Of course, I'd prefer a solution that doesn't require to copy the data into a new array, but I almost assume that's necessary.

like image 701
oceanhug Avatar asked Dec 12 '10 06:12

oceanhug


Video Answer


2 Answers

If you want to support arrays in C and Fortran order without ever copying, your C function needs to be flexible enough to support both orderings. This can be achieved by passing the strides of the NumPy array to the C function: Change the prototype to

void normalize_logspace_matrix(size_t nrow, size_t ncol, 
                               size_t rowstride, size_t colstride,
                               double* mat);

and the Cython call to

def normalize_logspace(np.ndarray[np.double_t, ndim=2] mat):
    cdef Py_ssize_t n, d, rowstride, colstride
    n = mat.shape[0]
    d = mat.shape[1]
    rowstride = mat.strides[0] // mat.itemsize
    colstride = mat.strides[1] // mat.itemsize
    normalize_logspace_matrix(n, d, rowstride, colstride, <double*> mat.data)
    return mat

Then, replace every occurence of mat[row*ncol + col] in your C code by mat[row*rowstride + col*colstride].

like image 63
Sven Marnach Avatar answered Oct 25 '22 18:10

Sven Marnach


In this case you really do want to create a copy of the input array (which can be a view on a real array) with guaranteed row-contiguous order. You can achieve this with something like this:

a = numpy.array(A, copy=True, order='C')

Also, consider taking a look at the exact array interface of Numpy (there is C part too).

like image 39
ulidtko Avatar answered Oct 25 '22 20:10

ulidtko