 

Passing a numpy array to C++

I have some code written in Python whose output is a numpy array, and now I want to send that output to C++ code, where the heavy part of the calculations will be performed.

I have tried using Cython's public cdef, but I am running into some issues. I would appreciate your help! Here is my code:

pymodule.pyx:

from pythonmodule import result # result is my numpy array
import numpy as np
cimport numpy as np
cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
cdef public void cfunc():
    print 'I am in here!!!'
    cdef np.ndarray[np.float64_t, ndim=2, mode='c'] res = result
    print res

Once this is cythonized, I call:

pymain.c:

#include <Python.h>
#include <numpy/arrayobject.h>
#include "pymodule.h"

int main() {
  Py_Initialize();
  initpymodule();
  test(2);
  Py_Finalize();
}

int test(int a)
{
    Py_Initialize();
    initpymodule();
    cfunc();
    return 0;
}

I am getting a NameError for the result variable when cfunc is called from C++. I have tried defining it with pointers and calling it indirectly from other functions, but the array remains invisible. I am pretty sure the answer is quite simple, but I just do not get it. Thanks for your help!

asked Jun 04 '16 08:06 by user3225486




1 Answer

Short Answer

The NameError was caused by Python not being able to find the module, because the working directory isn't automatically added to your PYTHONPATH. Calling setenv("PYTHONPATH", ".", 1); in your C/C++ code before Py_Initialize fixes this.

Longer Answer

There's an easy way to do this, apparently. With a python module pythonmodule.py containing an already created array:

import numpy as np

result = np.arange(20, dtype=np.float64).reshape((2, 10))

You can structure your pymodule.pyx to export that array by using the public keyword. By adding some auxiliary functions, you generally won't need to touch either the Python or the NumPy C-API:

from pythonmodule import result
from libc.stdlib cimport malloc
import numpy as np
cimport numpy as np


cdef public np.ndarray getNPArray():
    """ Return array from pythonmodule. """
    return <np.ndarray>result

cdef public int getShape(np.ndarray arr, int shape):
    """ Return Shape of the Array based on shape par value. """
    return <int>arr.shape[1] if shape else <int>arr.shape[0]

cdef public void copyData(float *** dst, np.ndarray src):
    """ Copy data from src numpy array to dst. """
    cdef float **tmp
    cdef int i, j, m = src.shape[0], n = src.shape[1]

    # Allocate initial pointer 
    tmp = <float **>malloc(m * sizeof(float *))
    if not tmp:
        raise MemoryError()

    # Allocate rows
    for j in range(m):
        tmp[j] = <float *>malloc(n * sizeof(float))
        if not tmp[j]:
            raise MemoryError()

    # Copy numpy Array
    for i in range(m):
        for j in range(n):
            tmp[i][j] = src[i, j]

    # Assign pointer to dst
    dst[0] = tmp

The functions getNPArray and getShape return the array and its shape, respectively. copyData extracts the array's data and copies it into plain C memory, so you can then finalize Python and keep working without the interpreter initialized.
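For comparison, the same row-by-row copy can be sketched in plain C. This is only an illustration: copyRows is a hypothetical helper that is not part of the code above, and it assumes the source is a C-contiguous (row-major) buffer of doubles, which is what a float64 array with mode='c' would provide. Unlike copyData, it also releases any rows already allocated if a later allocation fails.

```c
#include <stdlib.h>

/* Hypothetical helper (not from the answer above): copy an m x n row-major
 * buffer of doubles into freshly allocated rows of floats.
 * Returns NULL on allocation failure, after freeing what was allocated. */
float **copyRows(const double *src, int m, int n) {
    float **rows = (float **)malloc((size_t)m * sizeof(float *));
    if (!rows)
        return NULL;

    for (int i = 0; i < m; i++) {
        rows[i] = (float *)malloc((size_t)n * sizeof(float));
        if (!rows[i]) {
            /* Undo the rows allocated so far, then the pointer table. */
            while (i-- > 0)
                free(rows[i]);
            free(rows);
            return NULL;
        }
        /* Element (i, j) of a row-major array lives at offset i * n + j. */
        for (int j = 0; j < n; j++)
            rows[i][j] = (float)src[(size_t)i * n + j];
    }
    return rows;
}
```

The caller owns the returned rows and must free each row and then the pointer table.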

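A sample program (in C; the C++ version should look identical) would look like this. Here the Cython module is assumed to be compiled as pyxmod, hence pyxmod.h and initpyxmod(); substitute your own module name: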

#include <Python.h>
#include "numpy/arrayobject.h"
#include "pyxmod.h"
#include <stdio.h>
#include <stdlib.h>  // setenv

void printArray(float **arr, int m, int n);
void getArray(float ***arr, int * m, int * n);

int main(int argc, char **argv){
    // Holds data and shapes.
    float **data = NULL;
    int m, n;

    // Gets array and then prints it.
    getArray(&data, &m, &n);
    printArray(data, m, n);

    return 0;
}

void getArray(float ***data, int * m, int * n){
    // setenv is important, makes python find 
    // modules in working directory
    setenv("PYTHONPATH", ".", 1);

    // Initialize interpreter and module
    Py_Initialize();
    initpyxmod();

    // Use Cython functions.
    PyArrayObject *arr = getNPArray();
    *m = getShape(arr, 0);
    *n = getShape(arr, 1);

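    copyData(data, arr);

    // copyData leaves *data untouched (NULL) if it could not allocate.
    if (*data == NULL){
        fprintf(stderr, "Data is NULL\n");
        *m = *n = 0;  // nothing to print
    }

    // Always release the array and shut the interpreter down.
    Py_DECREF(arr);
    Py_Finalize();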
}

void printArray(float **arr, int m, int n){
    int i, j;
    for(i=0; i < m; i++){
        for(j=0; j < n; j++)
            printf("%f ", arr[i][j]);

        printf("\n");
    }
}

Always remember to set:

setenv("PYTHONPATH", ".", 1);

before you call Py_Initialize so Python can find modules in the working directory.

The rest is pretty straightforward. It might need some additional error checking and definitely needs a function to free the allocated memory.
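A minimal sketch of such a cleanup function follows. The name freeArray and its signature are assumptions made for illustration. It expects the layout copyData produces: a table of m row pointers, each allocated separately with malloc.

```c
#include <stdlib.h>

/* Hypothetical helper: release an array laid out as m separately
 * allocated rows plus a pointer table, then reset the caller's pointer
 * so it cannot be used or freed twice by accident. */
void freeArray(float ***arr, int m) {
    if (arr == NULL || *arr == NULL)
        return;  /* nothing to free */

    for (int i = 0; i < m; i++)
        free((*arr)[i]);  /* free(NULL) is a no-op, so partial rows are fine */

    free(*arr);
    *arr = NULL;
}
```

In the sample program it would be called as freeArray(&data, m); once printArray has finished.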

Alternate Way w/o Cython:

Doing it the way you are attempting is more hassle than it's worth. You would probably be better off using numpy.save to save your array in a .npy binary file and then using some C++ library that reads that file for you.
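If pulling in a library is not an option, a very small reader can be written by hand. The sketch below is not a library and not a general parser: loadNpy2D is a hypothetical helper that assumes a file written by numpy.save("result.npy", result) in format version 1.x, holding a 2-D, C-ordered, little-endian float64 array, read on a little-endian machine. Anything else is rejected by returning NULL.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: load a 2-D little-endian float64 array from a
 * version 1.x .npy file. Returns a malloc'd row-major buffer of m * n
 * doubles, or NULL if the file is missing or not in the assumed layout. */
double *loadNpy2D(const char *path, int *m, int *n) {
    unsigned char pre[10];
    char *header = NULL;
    double *data = NULL;
    const char *shape;
    size_t hlen, count;

    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;

    /* 6-byte magic, major version, minor version, 2-byte header length. */
    if (fread(pre, 1, 10, f) != 10 ||
        memcmp(pre, "\x93NUMPY", 6) != 0 || pre[6] != 1)
        goto done;
    hlen = (size_t)pre[8] | ((size_t)pre[9] << 8);

    /* The header is an ASCII dict literal describing dtype, order, shape. */
    header = (char *)malloc(hlen + 1);
    if (!header || fread(header, 1, hlen, f) != hlen)
        goto done;
    header[hlen] = '\0';

    shape = strstr(header, "'shape': (");
    if (!shape || !strstr(header, "'descr': '<f8'") ||
        !strstr(header, "'fortran_order': False"))
        goto done;
    if (sscanf(shape + strlen("'shape': ("), "%d, %d", m, n) != 2 ||
        *m <= 0 || *n <= 0)
        goto done;

    /* The raw array data follows the header directly. */
    count = (size_t)*m * (size_t)*n;
    data = (double *)malloc(count * sizeof(double));
    if (data && fread(data, sizeof(double), count, f) != count) {
        free(data);
        data = NULL;
    }

done:
    free(header);
    fclose(f);
    return data;
}
```

Element (i, j) of the result is data[i * n + j], and the caller frees the buffer with free. For other dtypes, Fortran order, or newer format versions, an existing reader library is the safer choice.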

answered Oct 15 '22 00:10 by Dimitris Fasarakis Hilliard