Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generalized Universal Function in numpy

I'm trying make a generalized ufunc using numpy API. The inputs are one (n x m) matrix and a scalar, and outputs are two matrix ((n x p) and (p x m)). But I don't knowing how to do it. Someone could help me? In init function I'm using PyUFunc_FromFuncAndDataAndSignature function with signature:

"(n,m),()->(n,p),(p,m)"

I can read the inputs (matrix and scalar), but I wanted to use the scalar input like the dimension p in signature. Is it possible?

Here a example code that just print the inputs:

#include "Python.h"
#include "math.h"
#include "numpy/ndarraytypes.h"
#include "numpy/ufuncobject.h"

static PyMethodDef nmfMethods[] = {
        {NULL, NULL, 0, NULL}
};


static void double_nmf(char **args, npy_intp *dimensions,
                            npy_intp* steps, void* data)
{
    npy_intp i, j, 
             n = dimensions[1], //dimensions of input matrix
             m = dimensions[2]; //

    printf("scalar: %d\n",*(int*)args[1]); // input scalar

    // just print input matrix
    printf("Input matrix:\n");
    for(i=0;i<n;i++){
        for(j=0;j<m;j++){
            printf("%.1f ",*(double*)(args[0]+8*(i*m+j)));
        }
    printf("\n");
    }
    return;

}

static PyUFuncGenericFunction nmf_functions[] = { double_nmf };
static void * nmf_data[] = { (void *)NULL };
static char nmf_signatures[] = { PyArray_DOUBLE, PyArray_INT, PyArray_DOUBLE, PyArray_DOUBLE };
char *nmf_signature = "(n,m),()->(n,p),(p,m)";

PyMODINIT_FUNC initnmf(void)
{
    PyObject *m, *d, *version, *nmf;

    m = Py_InitModule("nmf", nmfMethods);
    if (m == NULL) {
        return;
    }

    import_array();
    import_umath();
    d = PyModule_GetDict(m);
    version = PyString_FromString("0.1");
    PyDict_SetItemString(d, "__version__", version);
    Py_DECREF(version);

    nmf = PyUFunc_FromFuncAndDataAndSignature(nmf_functions, nmf_data, nmf_signatures, 1,
                                    2, 2, PyUFunc_None, "nmf",
                                    "", 0, nmf_signature);
    PyDict_SetItemString(d, "nmf", nmf);
    Py_DECREF(nmf);
}

This code compiles and works. The python script is here:

#/usr/bin/python

import numpy as np
import nmf

x = np.array([[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15],[16,17,18,19,20]])
y,z = nmf.nmf(x,2)
print "Shapes of outputs: ", y.shape, z.shape

And the terminal output is:

scalar: 2
Input matrix:
1.0 2.0 3.0 4.0 5.0 
6.0 7.0 8.0 9.0 10.0 
11.0 12.0 13.0 14.0 15.0 
16.0 17.0 18.0 19.0 20.0 
Shapes of outputs:  (4, 1) (1, 5)

My doubt is how use the scalar input (2 in the case) like dimension p of outputs matrices. In example p = 1, and I don't set it.

like image 537
fabincarmo Avatar asked Oct 20 '22 00:10

fabincarmo


2 Answers

There is no way to set a dimension in a gufunc, other than by providing an array of a certain size. The 1 is the value that all dimensions are initialized to internally, and you shouldn't rely on that not changing. My personal view on that is that an undefined dimension should raise an error.

The only thing you can do to set p, is to create an empty array with the right shape and pass it in as an output array. To achieve it, you would redefine your nmf to have signature "(m,n)->(m,p),(p,n)" and wrap it with some Python similar to:

def nmf_wrap(x, p):
    x = np.asarray(x)
    assert x.ndim >= 2
    shape = x.shape[:-2]
    m, n = x.shape[-2:]
    out1 = np.empty(shape+(m, p), dtype=x.dtype)
    out2 = np.empty(shape+(p, n), dtype=x.dtype)
    return nmf.nmf(x, out1, out2)

There has been some ongoing discussion on extending the functionality provided by gufuncs signatures on the numpy-dev mailing list recently. What you describe has some similarity with what I call there "computed dimensions." If you would like to see something implemented in numpy 1.10, it would be great if you could explain your use case in more detail on that list: we don't know of many (any?) gufunc coders out in the wild!

like image 104
Jaime Avatar answered Oct 22 '22 15:10

Jaime


Thanks @jaime for your answer, helped me a lot. I made the changes that you suggested and it works. Here the C example code. It just copies some input matrix elements to outputs.

#include "Python.h"
#include "numpy/ndarraytypes.h"
#include "numpy/ufuncobject.h"

static PyMethodDef nmfMethods[] = {
        {NULL, NULL, 0, NULL}
};


static void double_nmf(char **args, npy_intp *dimensions,
                            npy_intp* steps, void* data)
{
    npy_intp i, j,
            n = dimensions[1],
            m = dimensions[2],
            p = dimensions[3];
    char *in = args[0], *out1 = args[1], *out2 = args[2];

    for(i=0; i<n; i++){
        for(j=0; j<p; j++){
            *(double*)(out1 + 8*(j + p*i)) = *(double*)(in + 8*(j + m*i));
        }
    }
    for(i=0; i<p; i++){
        for(j=0; j<m; j++){
            *(double*)(out2 + 8*(j + m*i)) = *(double*)(in + 8*(j + m*i));
        }
    }
    return;

}

static PyUFuncGenericFunction nmf_functions[] = { double_nmf };
static void * nmf_data[] = { (void *)NULL };
static char nmf_signatures[] = { PyArray_DOUBLE, PyArray_DOUBLE, PyArray_DOUBLE };
char *nmf_signature = "(n,m)->(n,p),(p,m)";

PyMODINIT_FUNC initnmf(void)
{
    PyObject *m, *d, *version, *nmf;

    m = Py_InitModule("nmf", nmfMethods);
    if (m == NULL) {
        return;
    }

    import_array();
    import_umath();
    d = PyModule_GetDict(m);
    version = PyString_FromString("0.1");
    PyDict_SetItemString(d, "__version__", version);
    Py_DECREF(version);

    nmf = PyUFunc_FromFuncAndDataAndSignature(nmf_functions, nmf_data, nmf_signatures, 1,
                                    1, 2, PyUFunc_None, "nmf",
                                    "", 0, nmf_signature);
    PyDict_SetItemString(d, "nmf", nmf);
    Py_DECREF(nmf);
}

Here the Python example code and the terminal output.

#/usr/bin/python

import numpy as np
import nmf

def nmf_wrap(x,p):
    x = np.asarray(x)
    assert x.ndim >=2
    shape = x.shape[-2:]
    n,m = shape[-2:]
    out1 = np.empty((n, p), dtype=x.dtype)
    out2 = np.empty((p, m), dtype=x.dtype)
    return nmf.nmf(x, out1, out2)

x = np.array([[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15],[16,17,18,19,20]])
y,z = nmf_wrap(x,2)
print 'Input:\n', x
print 'Output 1:\n', y
print 'Output 2:\n', z

Input:
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]
 [16 17 18 19 20]]
Output 1:
[[ 1  2]
 [ 6  7]
 [11 12]
 [16 17]]
Output 2:
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]

Now I can continue to program.

I'm also writing the non-ufunc, using Python/C API (no numpy), as @nathaniel-j-smith suggested (that's what I got it.). I hope to have some results and compare the two approaches briefly.

like image 39
fabincarmo Avatar answered Oct 22 '22 15:10

fabincarmo