
Is there a way to apply a numpy function that takes two 1d arrays as arguments on each row of two 2d arrays together?

I am trying to run something like:

 np.bincount(array1, weights = array2, minlength=7)

where both array1 and array2 are 2D NumPy arrays of shape (m, n). My goal is for np.bincount() to be run m times, once for each pair of corresponding rows of array1 and array2.

I have tried np.apply_along_axis(), but as far as I can tell it only runs the function on each row of array1; it cannot also pass the matching row of array2 as the weights argument of np.bincount. Since this function is performance critical, I was hoping to do this cleanly with a NumPy function rather than by iterating, but so far I can't find a way.

For example, given these arrays:

array1 = [[1,2,3],[4,5,6]]
array2  = [[7,8,9],[10,11,12]]

I would want to compute:

[np.bincount([1,2,3], weights = [7,8,9], minlength=7),  np.bincount([4,5,6], weights = [10,11,12], minlength=7)]

Vhagar asked Sep 12 '25 13:09


1 Answer

A simple solution is to use a list comprehension:

result = [np.bincount(v, weights=w) for v,w in zip(array1, array2)]

Because the resulting arrays can differ in size (and do in your example: without minlength, the two results have lengths 4 and 7), the result cannot be a regular NumPy array and has to be a plain list. Most NumPy functions can neither operate on a list of variable-sized arrays nor produce one.
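
To make the size issue concrete, here is a small sketch using the arrays from the question. It assumes every value is below 7, so that minlength=7 is enough to give all rows the same length; the names ragged, ragged_lengths and stacked are only illustrative.

```python
import numpy as np

array1 = np.array([[1, 2, 3], [4, 5, 6]])
array2 = np.array([[7, 8, 9], [10, 11, 12]])

# Without minlength, each result has length max(row) + 1, so the rows differ.
ragged = [np.bincount(v, weights=w) for v, w in zip(array1, array2)]
ragged_lengths = [len(r) for r in ragged]

# With minlength=7 (and all values below 7), every result has length 7,
# so the list can be stacked into a single (2, 7) array.
stacked = np.stack([np.bincount(v, weights=w, minlength=7)
                    for v, w in zip(array1, array2)])
```

Note that np.bincount returns floating-point values when weights is given, so stacked is a float array even though the inputs are integers.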

If the arrays have many rows, you can reduce the overhead of the CPython interpreter loop with Numba's JIT (or possibly Cython in this case). Note that, for the sake of performance, the inputs must be converted to NumPy arrays before calling the Numba function. If you know that all the arrays are of the same size, you can write a more efficient Numba implementation by preallocating the result array and doing the bincount yourself.
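For the same-size case there is also a NumPy-only alternative that is not part of the approach described above: shift the values of row i by i * n_bins so that each row falls into its own block of bins, run a single np.bincount on the flattened data, and reshape. The following is a minimal sketch under the assumption that the inputs are non-empty and all values lie in [0, n_bins); bincount_rows and n_bins are hypothetical names chosen for illustration.

```python
import numpy as np

def bincount_rows(indices, weights, n_bins):
    """Row-wise weighted bincount for 2D inputs (hypothetical helper)."""
    indices = np.asarray(indices)
    weights = np.asarray(weights)
    assert indices.shape == weights.shape
    # Assumption: non-empty input with every value in [0, n_bins).
    assert indices.min() >= 0 and indices.max() < n_bins
    n_rows = indices.shape[0]
    # Shift row i into its own block [i * n_bins, (i + 1) * n_bins).
    offsets = np.arange(n_rows)[:, None] * n_bins
    flat = np.bincount((indices + offsets).ravel(),
                       weights=weights.ravel(),
                       minlength=n_rows * n_bins)
    return flat.reshape(n_rows, n_bins)

out = bincount_rows([[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]], 7)
```

This avoids the Python-level loop entirely, at the cost of one temporary array of shifted indices. Whether it beats a compiled loop depends on the array sizes, so it is worth measuring on real data.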


Update

With fixed-size arrays, here is a fast implementation in Numba:

import numpy as np
import numba as nb

array1 = np.array([[1,2,3],[4,5,6]], dtype=np.int32)
array2  = np.array([[7,8,9],[10,11,12]], dtype=np.int32)

@nb.njit('i4[:,::1](i4[:,::1],i4[:,::1])')
def compute(array1, array2):
    assert array1.shape == array2.shape
    n, m = array1.shape
    res = np.zeros((n, 7), dtype=np.int32)
    for i in range(n):
        for j in range(m):
            v = array1[i, j]
            assert v>=0 and v<7  # Can be removed if the input is safe
            res[i, v] += array2[i, j]
    return res

result = compute(array1, array2)

# result is
# array([[ 0,  7,  8,  9,  0,  0,  0],
#       [ 0,  0,  0,  0, 10, 11, 12]])
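
The double loop above is a scatter-add: each weight is added to the bin selected by the matching value. For readers who want to cross-check the result without installing Numba, the same accumulation can be sketched with np.add.at. This is an illustrative equivalent, not the implementation above, and it should not be expected to match the speed of the compiled version; the rows helper array is an assumption of this sketch.

```python
import numpy as np

array1 = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)
array2 = np.array([[7, 8, 9], [10, 11, 12]], dtype=np.int32)

n, m = array1.shape
res = np.zeros((n, 7), dtype=np.int32)

# Row indices of shape (n, 1); they broadcast against array1 of shape (n, m).
rows = np.arange(n)[:, None]

# Unbuffered scatter-add: res[i, array1[i, j]] += array2[i, j] for all i, j.
# Unlike res[rows, array1] += array2, repeated indices are accumulated.
np.add.at(res, (rows, array1), array2)
```

Like the Numba version, this keeps the int32 dtype of the inputs instead of converting to floating point as np.bincount with weights does.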

Jérôme Richard answered Sep 15 '25 03:09