Let's assume we have a C function that takes a set of one or more input arrays, processes them, and writes its output into a set of output arrays. The signature looks as follows (with count representing the number of array elements to be processed):
void compute (int count, float** input, float** output)
I want to call this function from Python via ctypes and use it to apply a transformation to a set of NumPy arrays. For a one-input/one-output function defined as
void compute (int count, float* input, float* output)
the following works:
import ctypes
import numpy
from numpy.ctypeslib import ndpointer
lib = ctypes.cdll.LoadLibrary('./block.so')
fun = lib.compute
fun.restype = None
fun.argtypes = [ctypes.c_int,
                ndpointer(ctypes.c_float),
                ndpointer(ctypes.c_float)]
data = numpy.ones(1000).astype(numpy.float32)
output = numpy.zeros(1000).astype(numpy.float32)
fun(1000, data, output)
However, I have no clue how to create the corresponding pointer array for multiple inputs (and/or outputs). Any ideas?
Edit: So people have been wondering how compute knows how many array pointers to expect (as count refers to the number of elements per array). This is, in fact, hard-coded; a given compute knows precisely how many inputs and outputs to expect. It's the caller's job to verify that input and output point to the right number of inputs and outputs. Here's an example compute taking 2 inputs and writing to 1 output array:
virtual void compute (int count, float** input, float** output) {
    float* input0 = input[0];
    float* input1 = input[1];
    float* output0 = output[0];
    for (int i=0; i<count; i++) {
        float fTemp0 = (float)input1[i];
        fRec0[0] = ((0.09090909090909091f * fTemp0) + (0.9090909090909091f * fRec0[1]));
        float fTemp1 = (float)input0[i];
        fRec1[0] = ((0.09090909090909091f * fTemp1) + (0.9090909090909091f * fRec1[1]));
        output0[i] = (float)((fTemp0 * fRec1[0]) - (fTemp1 * fRec0[0]));
        // post processing
        fRec1[1] = fRec1[0];
        fRec0[1] = fRec0[0];
    }
}
I have no way of influencing the signature and implementation of compute. I can verify (from Python!) how many inputs and outputs are required. The key problem is how to give the correct argtypes for the function, and how to produce appropriate data structures in NumPy (an array of pointers to NumPy arrays).
To do this specifically with Numpy arrays, you could use:
import numpy as np
import ctypes
count = 5
size = 1000
#create some arrays
arrays = [np.arange(size,dtype="float32") for ii in range(count)]
#get ctypes handles
ctypes_arrays = [np.ctypeslib.as_ctypes(array) for array in arrays]
#Pack into pointer array
pointer_ar = (ctypes.POINTER(ctypes.c_float) * count)(*ctypes_arrays)
ctypes.CDLL("./libfoo.so").foo(ctypes.c_int(count), pointer_ar, ctypes.c_int(size))
Where the C side of things might look like:
/* function to multiply all arrays by 2 */
void foo(int count, float** array, int size)
{
    int ii,jj;
    for (ii=0;ii<count;ii++){
        for (jj=0;jj<size;jj++)
            array[ii][jj] *= 2;
    }
}
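For the signature in the question, the same packing can be done once for the inputs and once for the outputs. The following is only a sketch: FloatPtr, make_pointer_array, ComputeProto and _py_compute are names made up for illustration, and _py_compute is a pure-Python stand-in (2 inputs, 1 output, simple addition) so the example runs without a shared library.

```python
import ctypes
import numpy as np

FloatPtr = ctypes.POINTER(ctypes.c_float)

def make_pointer_array(arrays):
    """Build a C-style float*[] whose entries point at the NumPy buffers."""
    for a in arrays:
        # The C side indexes raw memory, so require contiguous float32 data.
        if a.dtype != np.float32 or not a.flags["C_CONTIGUOUS"]:
            raise TypeError("expected C-contiguous float32 arrays")
    # Keep `arrays` alive for as long as the returned table is in use.
    return (FloatPtr * len(arrays))(*[a.ctypes.data_as(FloatPtr) for a in arrays])

# Prototype matching: void compute(int count, float** input, float** output)
ComputeProto = ctypes.CFUNCTYPE(None, ctypes.c_int,
                                ctypes.POINTER(FloatPtr), ctypes.POINTER(FloatPtr))

def _py_compute(count, inp, out):
    # Hypothetical stand-in for the C function (assumption, not the real code):
    # 2 inputs, 1 output, output0[i] = input0[i] + input1[i].
    for i in range(count):
        out[0][i] = inp[0][i] + inp[1][i]

compute = ComputeProto(_py_compute)

inputs = [np.arange(4, dtype=np.float32), np.ones(4, dtype=np.float32)]
outputs = [np.zeros(4, dtype=np.float32)]
compute(4, make_pointer_array(inputs), make_pointer_array(outputs))
print(outputs[0])
```

With a real library, the equivalent would presumably be to set restype = None and argtypes = [ctypes.c_int, ctypes.POINTER(FloatPtr), ctypes.POINTER(FloatPtr)] on lib.compute and pass the two pointer tables in the same way.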
In C, float** points to the first element in a table/array of float* pointers.
Presumably each of those float* points to the first element in a table/array of float values.
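The two levels of indirection can be reproduced in ctypes without any C code. This is a minimal sketch; FloatPtr, row0, row1, table and pp are illustrative names only.

```python
import ctypes

FloatPtr = ctypes.POINTER(ctypes.c_float)

# Two separate float buffers (the "rows").
row0 = (ctypes.c_float * 3)(1.0, 2.0, 3.0)
row1 = (ctypes.c_float * 3)(10.0, 20.0, 30.0)

# A table of float* whose entries point at the first element of each row.
table = (FloatPtr * 2)(row0, row1)

# A float** is a pointer to the first entry of that table.
pp = ctypes.cast(table, ctypes.POINTER(FloatPtr))

print(pp[1][2])  # second row, third element
pp[0][0] = 5.0   # writes straight into row0's memory
print(row0[0])
```

Nothing in the pointer itself records how many rows there are or how long each row is, which is why the C function needs that information from somewhere else.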
Your function declaration has 1 count, however it's not clear what this count applies to:
void compute (int count, float** input, float** output)
- a matrix count x count in size?
- a count-sized array of float*, each somehow terminated, e.g. with nan?
- a null-terminated array of float*, each of count elements (reasonable assumption)?
Please clarify your question and I will clarify my answer :-)
Assuming the last API interpretation, here's my sample compute function:
/* null-terminated array of float*, each points to count-sized array
*/
extern void compute(int count, float** in, float** out)
{
    while (*in)
    {
        for (int i=0; i<count; i++)
        {
            (*out)[i] = (*in)[i]*42;
        }
        in++; out++;
    }
}
Test code for the sample compute function:
#include <stdio.h>
extern void compute(int count, float** in, float** out);
int main(int argc, char** argv)
{
#define COUNT 3
    float ina[COUNT] = { 1.5, 0.5, 3.0 };
    float inb[COUNT] = { 0.1, -0.2, -10.0 };
    float outa[COUNT];
    float outb[COUNT];
    float* in[] = {ina, inb, (float*)0};
    float* out[] = {outa, outb, (float*)0};
    compute(COUNT, in, out);
    for (int row=0; row<2; row++)
        for (int c=0; c<COUNT; c++)
            printf("%d %d %f %f\n", row, c, in[row][c], out[row][c]);
    return 0;
}
And here is how you use the same function via ctypes in Python, for count == 10 float subarrays and a size-2 float* array containing 1 real subarray and the NULL terminator:
import ctypes
innertype = ctypes.ARRAY(ctypes.c_float, 10)
outertype = ctypes.ARRAY(ctypes.POINTER(ctypes.c_float), 2)
in1 = innertype(*range(10))
in_ = outertype(in1, None)
out1 = innertype(*range(10))
out = outertype(out1, None)
ctypes.CDLL("./compute.so").compute(10, in_, out)
for i in range(10): print(in_[0][i], out[0][i])
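The NULL terminator convention used above can be checked from Python before handing the table to C. A small sketch, with null_terminated and count_rows as hypothetical helper names:

```python
import ctypes

FloatPtr = ctypes.POINTER(ctypes.c_float)

def null_terminated(rows):
    """Return a float*[] holding one pointer per row plus a trailing NULL."""
    # A freshly created ctypes array is zero-initialised, so the last slot
    # is already a NULL pointer.
    table = (FloatPtr * (len(rows) + 1))()
    for i, row in enumerate(rows):
        table[i] = row
    return table

def count_rows(table):
    """Walk the table the way the C loop `while (*in)` does."""
    n = 0
    while table[n]:  # a NULL ctypes pointer is falsy
        n += 1
    return n

rows = [(ctypes.c_float * 10)(*range(10)) for _ in range(3)]
table = null_terminated(rows)
print(count_rows(table))
```

Forgetting the terminator would let the C loop run past the end of the table, so a check like count_rows is a cheap safeguard on the Python side.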
The NumPy interface to ctypes is covered here: http://www.scipy.org/Cookbook/Ctypes#head-4ee0c35d45f89ef959a7d77b94c1c973101a562f. arr.ctypes.shape[:], arr.ctypes.strides[:] and arr.ctypes.data are what you need; you might be able to feed that directly to your compute.
Here's an example:
In [55]: a = numpy.array([[0.0]*10]*2, dtype=numpy.float32)
In [56]: ctypes.cast(a.ctypes.data, ctypes.POINTER(ctypes.c_float))[0]
Out[56]: 0.0
In [57]: ctypes.cast(a.ctypes.data, ctypes.POINTER(ctypes.c_float))[0] = 1234
In [58]: a
Out[58]:
array([[ 1234., 0., 0., 0., 0., 0., 0., 0.,
0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0.]], dtype=float32)
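Building on the base address and strides, one 2D float32 array can also serve as the backing store for a whole float** table. This is a sketch under the assumption that the array is C-contiguous; FloatPtr and row_ptrs are illustrative names.

```python
import ctypes
import numpy as np

FloatPtr = ctypes.POINTER(ctypes.c_float)

a = np.zeros((2, 10), dtype=np.float32)  # two rows of ten floats, C-contiguous

# a.ctypes.data is the address of the first element; a.strides[0] is the
# byte distance between the starts of consecutive rows.
base = a.ctypes.data
row_ptrs = (FloatPtr * a.shape[0])(
    *[ctypes.cast(base + i * a.strides[0], FloatPtr) for i in range(a.shape[0])]
)

# Pointers built from raw addresses do not keep `a` alive, so `a` must
# outlive every use of row_ptrs.
row_ptrs[1][3] = 1234.0  # write through the second row pointer
print(a[1, 3])
```

row_ptrs can then be passed wherever a float** argument is expected, with each row acting as one input or output array.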