I am trying to run something like:
np.bincount(array1, weights=array2, minlength=7)
where both array1 and array2 are 2D NumPy arrays of shape (m, n). My goal is for np.bincount() to be run once per row (m times), each time with a row of array1 and the corresponding row of array2.
I have tried using np.apply_along_axis(), but as far as I can tell it only lets the function run on each row of array1, without passing the matching row of array2 as the weights argument of np.bincount. I was hoping to find a clean way to do this with a NumPy function rather than iteration, since this is a performance-critical function, but so far I can't find one.
For example, given these arrays:
array1 = [[1,2,3],[4,5,6]]
array2 = [[7,8,9],[10,11,12]]
I would want to compute:
[np.bincount([1,2,3], weights=[7,8,9], minlength=7), np.bincount([4,5,6], weights=[10,11,12], minlength=7)]
A simple solution is to use a list comprehension:
result = [np.bincount(v, weights=w) for v,w in zip(array1, array2)]
Without minlength, the resulting arrays can have different sizes (and they do in your example), so the result has to be a regular list rather than a NumPy array. Most NumPy functions cannot operate on a list of variable-sized arrays, or produce one.
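To make the size issue concrete, here is a small sketch using the arrays from the question. It assumes all values are below 7, so passing minlength=7 (as in the question) pads every row's result to the same length and the list can then be stacked into a single 2D array. The names ragged, lengths and padded are only illustrative.

```python
import numpy as np

array1 = np.array([[1, 2, 3], [4, 5, 6]])
array2 = np.array([[7, 8, 9], [10, 11, 12]])

# Without minlength, each result has length (largest value in the row) + 1,
# so the rows come out with different lengths.
ragged = [np.bincount(v, weights=w) for v, w in zip(array1, array2)]
lengths = [len(r) for r in ragged]

# With minlength=7 and every value below 7, each result has exactly 7 bins,
# so the per-row results can be stacked into one (rows, 7) array.
padded = np.stack(
    [np.bincount(v, weights=w, minlength=7) for v, w in zip(array1, array2)]
)
```

This still loops in Python once per row, so it mainly helps when the number of rows is small.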
If the arrays have many rows, you can reduce the overhead of the CPython interpreter loop by using Numba's JIT (or possibly Cython). Note that the inputs must be converted to NumPy arrays before calling the Numba function, for the sake of performance. If you know that all the results have the same size, you can write a more efficient implementation in Numba by preallocating the result array and doing the bincount yourself.
With fixed-size arrays, here is a fast implementation in Numba:
import numpy as np
import numba as nb

array1 = np.array([[1,2,3],[4,5,6]], dtype=np.int32)
array2 = np.array([[7,8,9],[10,11,12]], dtype=np.int32)

@nb.njit('i4[:,::1](i4[:,::1],i4[:,::1])')
def compute(array1, array2):
    assert array1.shape == array2.shape
    n, m = array1.shape
    res = np.zeros((n, 7), dtype=np.int32)
    for i in range(n):
        for j in range(m):
            v = array1[i, j]
            assert v>=0 and v<7  # Can be removed if the input is safe
            res[i, v] += array2[i, j]
    return res

result = compute(array1, array2)

# result is
# array([[ 0,  7,  8,  9,  0,  0,  0],
#        [ 0,  0,  0,  0, 10, 11, 12]])
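If Numba is not an option, a pure NumPy alternative is also possible. The sketch below is not part of the answer above: it shifts the values of each row into their own block of bins, runs a single np.bincount over the flattened data, and reshapes the result. The helper name bincount_rows is hypothetical, and the sketch assumes non-empty 2D integer input whose values all lie in the range 0 to minlength - 1.

```python
import numpy as np

def bincount_rows(idx, weights, minlength):
    """Row-wise weighted bincount done with one np.bincount call (illustrative helper)."""
    idx = np.asarray(idx)
    weights = np.asarray(weights)
    assert idx.shape == weights.shape
    # Assumption: values must fit in [0, minlength), otherwise rows would bleed
    # into each other's bins.
    assert idx.min() >= 0 and idx.max() < minlength

    n_rows = idx.shape[0]
    # Row i is shifted by i * minlength so that it gets its own block of bins.
    offsets = np.arange(n_rows)[:, None] * minlength
    flat = np.bincount(
        (idx + offsets).ravel(),
        weights=weights.ravel(),
        minlength=n_rows * minlength,
    )
    return flat.reshape(n_rows, minlength)

array1 = np.array([[1, 2, 3], [4, 5, 6]])
array2 = np.array([[7, 8, 9], [10, 11, 12]])
result = bincount_rows(array1, array2, minlength=7)
```

As with np.bincount and weights in general, the result here is a float64 array, whereas the Numba version above returns int32. Which of the two is faster likely depends on the array sizes, so it is worth timing both on your data.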