Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Returning numpy array from a C extension

For the sake of learning something new, I am currently trying to reimplement the numpy.mean() function in C. It should take a 3D array and return a 2D array with the mean of the elements along axis 0. I manage to calculate the mean of all values, but don't really know how I would return a new array to Python.

My code so far:

#include <Python.h>
#include <numpy/arrayobject.h>

// Actual magic here:
static PyObject*
myexts_std(PyObject *self, PyObject *args)
{
    PyArrayObject *input=NULL;
    int i, j, k, x, y, z, dims[2];
    double out = 0.0; 

    if (!PyArg_ParseTuple(args, "O!", &PyArray_Type, &input))
        return NULL;

    x = input->dimensions[0];
    y = input->dimensions[1];
    z = input->dimensions[2];

    for(k=0;k<z;k++){
        for(j=0;j<y;j++){
            for(i=0;i < x; i++){
                out += *(double*)(input->data + i*input->strides[0] 
+j*input->strides[1] + k*input->strides[2]);
            }
        }
    }
    out /= x*y*z;
    return Py_BuildValue("f", out);
}

// Methods table - this defines the interface to python by mapping names to
// c-functions    
static PyMethodDef myextsMethods[] = {
    {"std", myexts_std, METH_VARARGS,
        "Calculate the standard deviation pixelwise."},
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC initmyexts(void)
{
    (void) Py_InitModule("myexts", myextsMethods);
    import_array();
}

What I understand so far (and please correct me if I'm wrong) is that I need to create a new PyArrayObject, which will be my output (maybe with PyArray_FromDims ?). Then I need an array of adresses to the memory of this array and fill it with data. How would I go about this?

EDIT:

After doing some more reading on pointers (here: http://pw1.netcom.com/~tjensen/ptr/pointers.htm), I achieved what I was aiming at. Now another question arises: Where would I find the origingal implementation of numpy.mean()? I'd like to see how it is, that the python operation is so much faster than my version. I assume it avoids the ugly looping.

Here is my solution:

static PyObject*
myexts_std(PyObject *self, PyObject *args)
{
    PyArrayObject *input=NULL, *output=NULL; // will be pointer to actual numpy array ?
    int i, j, k, x, y, z, dims[2]; // array dimensions ?
    double *out = NULL;
    if (!PyArg_ParseTuple(args, "O!", &PyArray_Type, &input))
        return NULL;

    x = input->dimensions[0];
    y = dims[0] = input->dimensions[1];
    z = dims[1] = input->dimensions[2];
    output = PyArray_FromDims(2, dims, PyArray_DOUBLE);    
    for(k=0;k<z;k++){
        for(j=0;j<y;j++){
            out = output->data + j*output->strides[0] + k*output->strides[1];
            *out = 0;
            for(i=0;i < x; i++){
                *out += *(double*)(input->data + i*input->strides[0] +j*input->strides[1] + k*input->strides[2]);
            }
            *out /= x;
        }
    }
    return PyArray_Return(output);
}
like image 747
karlson Avatar asked May 16 '13 10:05

karlson


People also ask

Can you use NumPy in C?

NumPy is written in C, and executes very quickly as a result. By comparison, Python is a dynamic language that is interpreted by the CPython interpreter, converted to bytecode, and executed. While it's no slouch, compiled C code is always going to be faster.

What is __ Array_interface __?

__array_interface__ A dictionary of items (3 required and 5 optional). The optional keys in the dictionary have implied defaults if they are not provided. The keys are: shape (required) Tuple whose elements are the array size in each dimension.

Is NumPy an extension?

Numpy is a extension package to Python for multi-dimensional arrays. It is designed for scientific computation. It is also Known as array oriented computing.

Are NumPy arrays C arrays?

A NumPy array is a Python object implemented using Python's C API. NumPy arrays do provide an API at the C level, but they cannot be created independent from the Python interpreter. They are especially useful because of all the different array manipulation routines available in NumPy and SciPy.


1 Answers

The Numpy API has a function PyArray_Mean that accomplishes what you're trying to do without the "ugly looping" ;).

static PyObject *func1(PyObject *self, PyObject *args) {
    PyArrayObject *X, *meanX;
    int axis;

    PyArg_ParseTuple(args, "O!i", &PyArray_Type, &X, &axis);
    meanX = (PyArrayObject *) PyArray_Mean(X, axis, NPY_DOUBLE, NULL);

    return PyArray_Return(meanX);
}
like image 71
Joel Vroom Avatar answered Sep 17 '22 06:09

Joel Vroom