
`vectorize` for each row in Numpy

Suppose I have an n x m Matrix and want to call a function fct on each of its elements. I can do it like:

import numpy

A = numpy.array(...)
vec_func = numpy.vectorize(fct)
A_out = vec_func(A)

This applies the function to each matrix element individually, where fct is a function like:

def fct(a_ij):
  # do something with matrix element a(i, j)

Now I'd like the same, but for each row of the matrix:

def fct(row_i):
  # do something with matrix row(i)

Is there a way to do it with numpy.vectorize or similar?

Michael asked Nov 09 '22 15:11



1 Answer

Edit: it looks like np.apply_along_axis does what you want. For example:

import numpy as np

def f(x):
    return x * x.sum()

X = np.arange(12).reshape(2, 2, 3)
np.apply_along_axis(f, -1, X)
# array([[[  0,   3,   6],
#         [ 36,  48,  60]],
#
#        [[126, 147, 168],
#         [270, 300, 330]]])
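
For the plain n x m case from the question, the same call works with axis=1, and the per-row function may also return a scalar. The following is a minimal sketch; row_range and the sample matrix are hypothetical names and data chosen only for illustration:

```python
import numpy as np

def row_range(row):
    # Hypothetical per-row function: reduces a 1-D row to one scalar.
    return row.max() - row.min()

A = np.array([[1, 5, 2],
              [7, 3, 9]])

# axis=1 passes each row of the 2-D array to row_range in turn.
# Because row_range returns a scalar, the result has one entry per row.
row_ranges = np.apply_along_axis(row_range, 1, A)
print(row_ranges)  # [4 6]
```

When the function returns an array instead of a scalar, the returned values replace the row, as in the example above.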

The notes on performance from my original response below still apply.


Original response:

There's no built-in for this, but Python makes it straightforward to define such a decorator yourself. For example:

import numpy as np
from functools import wraps

def row_vectorize(f):
    @wraps(f)
    def wrapped_f(X):
        X = np.asarray(X)
        rows = X.reshape(-1, X.shape[-1])
        return np.reshape([f(row) for row in rows],
                          X.shape[:-1] + (-1,))
    return wrapped_f


@row_vectorize
def func(row):
    return row * row.sum()

Now you can use this on arrays of any non-zero dimension:

>>> X_1D = np.arange(3)
>>> func(X_1D)
array([0, 3, 6])

>>> X_2D = np.arange(6).reshape(2, 3)
>>> func(X_2D)
array([[ 0,  3,  6],
       [36, 48, 60]])

>>> X_3D = np.arange(12).reshape((2, 2, 3))
>>> func(X_3D)
array([[[  0,   3,   6],
        [ 36,  48,  60]],

       [[126, 147, 168],
        [270, 300, 330]]])

Performance-wise, np.vectorize does something very similar: it is essentially a Python-level loop over the inputs, provided for convenience rather than speed.
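
For completeness, np.vectorize itself can be pointed at rows instead of elements through its signature argument (available in NumPy 1.12 and later). This is a sketch rather than a recommendation, since it still loops in Python; the names fct, vec_rows, and A are illustrative:

```python
import numpy as np

def fct(row):
    # Same per-row operation as in the examples above.
    return row * row.sum()

# '(m)->(m)' declares that fct takes a length-m vector and
# returns a length-m vector; leading axes are looped over.
vec_rows = np.vectorize(fct, signature='(m)->(m)')

A = np.arange(6).reshape(2, 3)
A_out = vec_rows(A)
print(A_out)
```

For this 2 x 3 input the result should match the 2-D output of the row_vectorize decorator shown earlier.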

If you need faster looping for a custom function applied across an array, you can often express it in terms of NumPy element-wise operations and aggregate operations, which run in compiled code. For example, this function accomplishes the same thing as the row-vectorized function above, but will be much quicker on large inputs:

def func2(X):
    return X * X.sum(-1, keepdims=True)
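
The reason this works is that keepdims=True leaves the summed axis in place with length 1, so the row sums broadcast back against X. A quick way to convince yourself that the two approaches agree is a check along these lines (a sketch; the variable names looped and broadcast are just illustrative):

```python
import numpy as np

def f(row):
    # Row-at-a-time version, called once per row.
    return row * row.sum()

def func2(X):
    # Whole-array version: row sums have shape (..., 1)
    # and broadcast across the last axis of X.
    return X * X.sum(-1, keepdims=True)

X = np.arange(12).reshape(2, 2, 3)
looped = np.apply_along_axis(f, -1, X)
broadcast = func2(X)
print(np.array_equal(looped, broadcast))  # True
```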

If you have a more complicated operation you'd like to apply across rows of an array, and the performance of the loops is a bottleneck, the best options are probably Numba or Cython.

jakevdp answered Nov 15 '22 07:11
