I have an application where I need to sum over arbitrary groups of indices in a 3D NumPy array. The built-in NumPy sum routine sums over every index along one axis of an ndarray. Instead, I need to sum over ranges of indices along one axis of my array and return a new array.
For example, let's assume that I have an ndarray with shape (75,25,3). I wish to sum along the first dimension over certain index ranges and return a new 3D array. Consider the sums over 0:25, 25:50 and 50:75, which would return an array of shape (3,25,3).
Is there an easy way to do "disjoint sums" along one dimension of a NumPy array to produce this result?
You can use np.add.reduceat as a general approach to this problem. This works even if the ranges are not all the same length.
To sum the slices 0:25, 25:50 and 50:75 along axis 0, pass in indices [0, 25, 50]:
np.add.reduceat(a, [0, 25, 50], axis=0)
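As a quick sanity check, here is a minimal sketch that compares the reduceat call with explicit slice sums. It assumes `a` has shape (75, 25, 3); the random data is made up purely for illustration.

```python
import numpy as np

# Hypothetical test data with the shape discussed in the question.
rng = np.random.default_rng(0)
a = rng.random((75, 25, 3))

# Each index starts a segment; the last segment runs to the end of axis 0.
starts = [0, 25, 50]
out = np.add.reduceat(a, starts, axis=0)

# Reference result built from explicit slices.
expected = np.stack([
    a[0:25].sum(axis=0),
    a[25:50].sum(axis=0),
    a[50:75].sum(axis=0),
])

print(out.shape)                   # (3, 25, 3)
print(np.allclose(out, expected))  # True
```

Note that only the start of each range is passed in: every segment ends where the next one begins, and the final segment extends to the end of the axis.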
This method can also be used to sum non-contiguous ranges. For instance, to sum the slices 0:25, 37:47 and 51:75, write:
np.add.reduceat(a, [0,25, 37,47, 51], axis=0)[::2]
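To see why the `[::2]` step is needed, the sketch below spells out the intermediate result. It again assumes `a` has 75 rows along axis 0 (so the last range ends at 75) and uses made-up integer data so the comparison is exact.

```python
import numpy as np

# Hypothetical integer data; 75 rows so that the last range 51:75 reaches the end.
a = np.arange(75 * 2 * 3, dtype=np.int64).reshape(75, 2, 3)

# Boundaries alternate between "start of a wanted range" and "start of a gap".
bounds = [0, 25, 37, 47, 51]
all_sums = np.add.reduceat(a, bounds, axis=0)

# all_sums holds five segments: 0:25, 25:37, 37:47, 47:51, 51:75.
# Every second one (the even positions) is a range we asked for.
wanted = all_sums[::2]

expected = np.stack([
    a[0:25].sum(axis=0),
    a[37:47].sum(axis=0),
    a[51:75].sum(axis=0),
])

print(all_sums.shape)                    # (5, 2, 3)
print(wanted.shape)                      # (3, 2, 3)
print(np.array_equal(wanted, expected))  # True
```

The boundaries should be strictly increasing here; when an index is not smaller than the one after it, reduceat returns the single element at that index for that position instead of a sum.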
An alternative approach to summing ranges of the same length is to reshape the array and then sum along an axis. The equivalent to the first example above would be:
a.reshape(3, a.shape[0]//3, a.shape[1], a.shape[2]).sum(axis=1)
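A small sketch of the reshape approach, checked against reduceat. It assumes the length of axis 0 is an exact multiple of the number of groups (75 rows, 3 groups); `n_groups` is just an illustrative variable name.

```python
import numpy as np

# Hypothetical integer data with the shape from the question.
a = np.arange(75 * 25 * 3, dtype=np.int64).reshape(75, 25, 3)

n_groups = 3
# reshape raises ValueError if the axis cannot be split evenly,
# for example 70 rows into 3 groups.
assert a.shape[0] % n_groups == 0

# Split axis 0 into (group, position within group), then sum within each group.
grouped = a.reshape(n_groups, a.shape[0] // n_groups, *a.shape[1:])
out = grouped.sum(axis=1)

same = np.array_equal(out, np.add.reduceat(a, [0, 25, 50], axis=0))
print(out.shape)  # (3, 25, 3)
print(same)       # True
```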
Just sum each portion and use the results to create a new array.
import numpy as np
i1, i2 = (2,7)
a = np.ones((10,5,3))
b = np.sum(a[0:i1,...], 0)
c = np.sum(a[i1:i2,...], 0)
d = np.sum(a[i2:,...], 0)
g = np.array([b,c,d])
>>> g.shape
(3, 5, 3)
>>> g
array([[[ 2., 2., 2.],
[ 2., 2., 2.],
[ 2., 2., 2.],
[ 2., 2., 2.],
[ 2., 2., 2.]],
[[ 5., 5., 5.],
[ 5., 5., 5.],
[ 5., 5., 5.],
[ 5., 5., 5.],
[ 5., 5., 5.]],
[[ 3., 3., 3.],
[ 3., 3., 3.],
[ 3., 3., 3.],
[ 3., 3., 3.],
[ 3., 3., 3.]]])
>>>
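The same idea extends to any number of ranges with a short loop. The helper name `sum_ranges` below is hypothetical (it is not a NumPy function), and this is only a sketch that assumes the ranges are given as (start, stop) pairs along axis 0.

```python
import numpy as np

def sum_ranges(arr, ranges):
    """Sum arr over each (start, stop) slice of axis 0 and stack the results."""
    return np.stack([arr[start:stop].sum(axis=0) for start, stop in ranges])

a = np.ones((10, 5, 3))
g = sum_ranges(a, [(0, 2), (2, 7), (7, 10)])

print(g.shape)     # (3, 5, 3)
print(g[:, 0, 0])  # [2. 5. 3.]
```

Because each range is sliced independently, the ranges may overlap or leave gaps, at the cost of a Python-level loop over the groups.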