Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does one acces numpy multidimensionnal array in c extensions?

I have been struggling for a few day to understand the access to numpy arrays in C extension, but I've trouble understanding the documentation.

Edit: Here is the code I would like to port to c (the grav function)

import numpy as np

def grav(p, M):
    G = 6.67408*10**-2     # m³/s²T
    l = len(p[0])
    a = np.empty(shape=(2, l))
    a[:, 0] = 0
    for b in range(1, l):
        # computing the distance between body #b and all previous
        d = p[:, b:b+1] - p[:, :b]
        d2 = (d*d).sum(axis=0)
        d2[d2==0] = 1
        XXX = G * d * d2**(-1.5)

        # computing Newton formula : 
        # acceleration undergone by b from all previous
        a[:, b] = -(M[None, :b] * XXX).sum(axis=1)

        # computing Newton formula : adding for each previous,
        # acceleration undergone by from b
        a[:, :b] += M[b] * XXX
    return a

system_p = np.array([[1., 2., 3., 4., 5., 6., 7., 9., 4., 0.],
                     [3., 2., 5., 6., 3., 5., 6., 3., 5., 8.]])
system_M = np.array( [3., 5., 1., 2., 4., 5., 4., 5., 6., 8.])

system_a = grav(system_p, system_M)
for i in range(len(system_p[0])):
    print('body {:2} mass = {}(ton), position = {}(m), '
          'acceleration = [{:8.4f} {:8.4f}](m/s²)'.format(i,
              system_M[i], system_p[:, i], system_a[0, i], system_a[1, i]))

I've found here a simple example using a simple iterator. It works perfectly well, but it doesn't go beyond one dimensional array and provides no information about how to proceed when your array has multiple dimensions and you wish to iterate on a subset of them or you wish to specify in which order (e.g. row-wise/column-wise) you want to iterate them. i.e. with this method, you can iterate through a multidimensional array, but only in one single way.

Edit : It appears that NpyIter_MultiNew is not related with multidimensionnal iteration but with iterating multiple arrays in one go.
Looking in the documentation, I found this function:

NpyIter* NpyIter_MultiNew(npy_intp nop,
                          PyArrayObject** op,
                          npy_uint32 flags,
                          NPY_ORDER order,
                          NPY_CASTING casting,
                          npy_uint32* op_flags,
                          PyArray_Descr** op_dtypes)

which might be what I need, but I don't even understand the first sentence of the description:

Creates an iterator for broadcasting the nop array objects provided in op, [...]

What is this «nop array objects»? What is it's relationship with the op parameter? I know I'm no native English speaker, but I still have the feeling that this documentation could be clearer than it is.

Then I found other resources like this which seem to have a completely different approach (no iterators - so, I suppose, manual iteration), but it doesn't even compile without correcting it (still working on it though).

So, please, does anyone here have experience in this regard, who could provide simple examples on how to do that?

like image 508
Camion Avatar asked Jan 27 '23 07:01

Camion


1 Answers

Ok, I finally managed to do it. Since the greatest difficulty was to find good introductory material, I leave an example as sample code. Here are the functions⁽¹⁾ of the API I used or considered using:
(1): described in the documentation

PyArray_Descr *PyArray_DESCR(PyArrayObject* arr)¶

Is a macro which "returns" the PyArray_Descr *descr field of the C PyArrayObject structure, which is a pointer to the dtype property of the array.

int PyArray_NDIM(PyArrayObject *arr)

Is a macro which "returns" the int nd field of the C PyArrayObject structure, which contains the number of dimensions of the array.

npy_intp *PyArray_DIMS(PyArrayObject *arr)
npy_intp *PyArray_SHAPE(PyArrayObject *arr)

are synonym macros which "return" the npy_intp *dimensions field of the C PyArrayObject structure, which points to a C array containing the size for all dimensions of the array, or

npy_intp PyArray_DIM(PyArrayObject* arr, int n)

Which "returns" the nth entry in the previous array (i.e. the size of the nth dimension.

npy_intp *PyArray_STRIDES(PyArrayObject* arr)

or

npy_intp PyArray_STRIDE(PyArrayObject* arr, int n)

are macros which respectively "return" the npy_intp *strides field of the C PyArrayObject structure which points to the (array of) strides of the array, or the nth entry in this array. The strides are the number of bytes to skip between "lines" for all dimensions of the array. Since the array in contiguous, this shouldn't be needed, but might avoid the program to have to multiply itself, the number of cells by their size.

PyObject* PyArray_NewLikeArray(PyArrayObject* prototype, NPY_ORDER order, PyArray_Descr* descr, int subok)

is a function which creates a new numpy array, having the same shape as the prototype passed as parameter. This array is uninitialized.

PyArray_FILLWBYTE(PyObject* obj, int val)

is a function which calls memset to initialize a given numpy array.

void *PyArray_DATA(PyArrayObject *arr)

Is a macro which "returns" the char *data field of the C PyArrayObject structure, which points to the real data space of the array, which has the same shape as a C array.

Here is the declaration of the PyArrayObject structure, as described in the documentation :

typedef struct PyArrayObject {
    PyObject_HEAD
    char *data;
    int nd;
    npy_intp *dimensions;
    npy_intp *strides;
    PyObject *base;
    PyArray_Descr *descr;
    int flags;
    PyObject *weakreflist;
} PyArrayObject;

And here is the sample code:

#define PY_SSIZE_T_CLEAN
#include <Python.h>
#include <numpy/arrayobject.h>


#define G 6.67408E-8L 

void * failure(PyObject *type, const char *message) {
    PyErr_SetString(type, message);
    return NULL;
}

void * success(PyObject *var){
    Py_INCREF(var);
    return var;
}

static PyObject *
Py_grav_c(PyObject *self, PyObject *args)
{
    PyArrayObject *p, *M;
    PyObject *a;
    int i, j, k;
    double *pq0, *pq1, *Mq0, *Mq1, *aq0, *aq1, *p0, *p1, *a0, *a1;


    if (!PyArg_ParseTuple(args, "O!O!", &PyArray_Type, &p, &PyArray_Type, &M))
        return failure(PyExc_RuntimeError, "Failed to parse parameters.");

    if (PyArray_DESCR(p)->type_num != NPY_DOUBLE)
        return failure(PyExc_TypeError, "Type np.float64 expected for p array.");

    if (PyArray_DESCR(M)->type_num != NPY_DOUBLE)
        return failure(PyExc_TypeError, "Type np.float64 expected for M array.");

    if (PyArray_NDIM(p)!=2)
        return failure(PyExc_TypeError, "p must be a 2 dimensionnal array.");

    if (PyArray_NDIM(M)!=1)
        return failure(PyExc_TypeError, "M must be a 1 dimensionnal array.");

    int K = PyArray_DIM(p, 0);     // Number of dimensions you want
    int L = PyArray_DIM(p, 1);     // Number of bodies in the system
    int S0 = PyArray_STRIDE(p, 0); // Normally, the arrays should be contiguous
    int S1 = PyArray_STRIDE(p, 1); // But since they provide this Stride info
    int SM = PyArray_STRIDE(M, 0); // I supposed they might not be (alignment)

    if (PyArray_DIM(M, 0) != L)
        return failure(PyExc_TypeError, 
                       "P and M must have the same number of bodies.");

    a = PyArray_NewLikeArray(p, NPY_ANYORDER, NULL, 0);
    if (a == NULL)
        return failure(PyExc_RuntimeError, "Failed to create output array.");
    PyArray_FILLWBYTE(a, 0);

    // For all bodies except first which has no previous body
    for (i = 1,
         pq0 = (double *)(PyArray_DATA(p)+S1),
         Mq0 = (double *)(PyArray_DATA(M)+SM),
         aq0 = (double *)(PyArray_DATA(a)+S1);
         i < L;
         i++,
         *(void **)&pq0 += S1,
         *(void **)&Mq0 += SM,
         *(void **)&aq0 += S1
         ) {
        // For all previous bodies
        for (j = 0,
            pq1 = (double *)PyArray_DATA(p),
            Mq1 = (double *)PyArray_DATA(M),
            aq1 = (double *)PyArray_DATA(a);
            j < i;
            j++,
            *(void **)&pq1 += S1,
            *(void **)&Mq1 += SM,
            *(void **)&aq1 += S1
             ) {
            // For all dimensions calculate deltas
            long double d[K], d2 = 0, VVV, M0xVVV, M1xVVV;
            for (k = 0,
                 p0 = pq0,
                 p1 = pq1;
                 k<K;
                 k++,
                 *(void **)&p0 += S0,
                 *(void **)&p1 += S0) {
                d[k] = *p1 - *p0;
            }
            // calculate Hypotenuse squared
            for (k = 0, d2 = 0; k<K; k++) {
                d2 += d[k]*d[k];
            }
            // calculate interm. results once for each bodies pair (optimization)
            VVV = G * (d2>0 ? pow(d2, -1.5) : 1); // anonymous intermediate result 
#define LIM = 1
//            VVV = G * pow(max(d2, LIM), -1.5);  // Variation on collision case
            M0xVVV = *Mq0 * VVV;                  // anonymous intermediate result
            M1xVVV = *Mq1 * VVV;                  // anonymous intermediate result
            // For all dimensions calculate component of acceleration
            for (k = 0,
                 a0 = aq0,
                 a1 = aq1;
                 k<K;
                 k++,
                 *(void **)&a0 += S0,
                 *(void **)&a1 += S0) {
                *a0 += M1xVVV*d[k];
                *a1 -= M0xVVV*d[k];
            }
        }
    }

    /*  clean up and return the result */
    return success(a);
}



// exported functions list

static PyMethodDef grav_c_Methods[] = {
    {"grav_c", Py_grav_c, METH_VARARGS, "grav_c(p, M)\n"
"\n"
"grav_c takes the positions and masses of m bodies in Newtonian"
" attraction in a n dimensional universe,\n"
"and returns the accelerations each body undergoes.\n"
"input data take the for of a row of fload64 for each dimension of the"
" position (in p) and one row for the masses.\n"
"It returns and array of the same shape as p for the accelerations."},
    {NULL, NULL, 0, NULL} // pour terminer la liste.
};


static char grav_c_doc[] = "Compute attractions between n bodies.";



static struct PyModuleDef grav_c_module = {
    PyModuleDef_HEAD_INIT,
    "grav_c",   /* name of module */
    grav_c_doc, /* module documentation, may be NULL */
    -1,         /* size of per-interpreter state of the module,
                 or -1 if the module keeps state in global variables. */
    grav_c_Methods
};



PyMODINIT_FUNC
PyInit_grav_c(void)
{
    // I don't understand why yet, but the program segfaults without this.
    import_array();

    return PyModule_Create(&grav_c_module);
} 
like image 95
Camion Avatar answered Jan 29 '23 23:01

Camion