Given the following data (in Python 2.7):
import numpy as np
a = np.array([1,2,3,4,5,6,7,8,9,10,11,12,14])
b = np.array([8,2,3])
I want to get the sum of the first 8 elements in a, then the sum of the 9th and 10th elements, and finally the sum of the last 3 (basically, the segment lengths given in b). The desired output is:
[36, 19, 37]
I can do this with for loops and such, but there must be a more Pythonic and more efficient way of doing it!
That's easy with np.split:
result = [part.sum() for part in np.split(a, np.cumsum(b))[:-1]]
print(result)
>>> [36, 19, 37]
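To see why the trailing [:-1] is needed, here is a small sketch of the intermediate steps. The names cut_points, parts and sums are only illustrative and are not part of the answer above.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14])
b = np.array([8, 2, 3])

# Cumulative lengths give the positions at which to cut a: [8, 10, 13]
cut_points = np.cumsum(b)

# Cutting at 8, 10 and 13 yields four pieces; the last one is empty
# because 13 == len(a), which is why it gets dropped before summing.
parts = np.split(a, cut_points)

sums = [int(part.sum()) for part in parts[:-1]]
print(sums)  # [36, 19, 37]
```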
A much faster way than np.split is:
np.add.reduceat(a, np.r_[0, np.cumsum(b)[:-1]])
What this does:
b gives the lengths of the ranges you want to sum over. For simplicity, you can assign c = np.r_[0, np.cumsum(b)[:-1]], which for your example would be array([0, 8, 10]): a 0 followed by all but the last element of the cumulative sum of b (np.cumsum(b) -> array([8, 10, 13])). Each entry of c marks the start of a range, and 13 equals len(a), so it is not a valid start index and we have to get rid of it.
np.ufunc.reduceat(a, c) reduces a by ufunc (in this case, add) over the ranges specified by c[i]:c[i+1]. When i+1 would run past the end of c, it instead reduces over c[i]:, that is, from c[i] through the end of a.
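As a rough check of that description, the sketch below compares reduceat with the same sums written as explicit slices. The names stops, fast and slow are made up for this illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14])
b = np.array([8, 2, 3])

# Start index of each segment: [0, 8, 10]
c = np.r_[0, np.cumsum(b)[:-1]]
fast = np.add.reduceat(a, c)

# The same computation spelled out with slices a[start:stop];
# the final segment runs from c[-1] to the end of a.
stops = list(c[1:]) + [len(a)]
slow = [int(a[start:stop].sum()) for start, stop in zip(c, stops)]

print(fast.tolist())  # [36, 19, 37]
print(slow)           # [36, 19, 37]
```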
reduce just condenses an array to a single value. For example, np.add.reduce(a) gives the same result as np.sum(a) or a.sum(). reduceat does this once per range, and since it pushes the for loop in the answer by @jdehsa out of Python and into NumPy's compiled C code, it is much faster.
Speed test:
b = np.random.randint(1,10,(10000,))
a = np.random.randint(1,10,(np.sum(b),))
%timeit np.add.reduceat(a, np.r_[0, np.cumsum(b)[:-1]])
1000 loops, best of 3: 293 µs per loop
%timeit [part.sum() for part in np.split(a, np.cumsum(b))[:-1]]
10 loops, best of 3: 44.6 ms per loop
It also has the added benefit of not building a temporary list of sub-arrays from a.
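The %timeit lines above only work in IPython. Outside of it, a comparable measurement could be sketched with the standard timeit module as below. The fixed seed and the function names with_reduceat and with_split are assumptions made for this example, and the timings it prints will differ from machine to machine.

```python
import timeit
import numpy as np

rng = np.random.RandomState(0)  # fixed seed, chosen arbitrarily
b = rng.randint(1, 10, size=10000)
a = rng.randint(1, 10, size=b.sum())

def with_reduceat():
    return np.add.reduceat(a, np.r_[0, np.cumsum(b)[:-1]])

def with_split():
    return [part.sum() for part in np.split(a, np.cumsum(b))[:-1]]

# Both approaches should agree before their speed is compared.
assert np.array_equal(with_reduceat(), with_split())

for func in (with_reduceat, with_split):
    seconds = timeit.timeit(func, number=10) / 10
    print("%s: %.3f ms per call" % (func.__name__, seconds * 1e3))
```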
You can use the reduceat method of the np.add ufunc. You just need to add a zero in front of your indices and discard the last index (if it covers the complete array):
>>> import numpy as np
>>> a = np.array([1,2,3,4,5,6,7,8,9,10,11,12,14])
>>> b = np.array([8,2,3])
>>> np.add.reduceat(a, np.append([0], np.cumsum(b)[:-1]))
array([36, 19, 37], dtype=int32)
The [:-1] discards the last index, and np.append([0], ...) puts a zero in front of the indices.
Note that this is a slightly adapted variant of DanielF's answer.
If you don't like the append, you could also build the index array yourself:
>>> b_sum = np.zeros_like(b)
>>> np.cumsum(b[:-1], out=b_sum[1:])  # insert the cumsum in the b_sum array directly
>>> np.add.reduceat(a, b_sum)
array([36, 19, 37], dtype=int32)
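If the lengths in b do not cover all of a, reduceat would let the last segment run to the end of a. A minimal sketch of a wrapper that guards against this is shown below. The helper name segment_sums is hypothetical, and it assumes that lengths is non-empty, that every length is at least 1, and that the lengths sum to no more than len(values).

```python
import numpy as np

def segment_sums(values, lengths):
    """Sum consecutive segments of values with the given lengths.

    Hypothetical helper; assumes lengths is non-empty and all entries >= 1.
    """
    values = np.asarray(values)
    lengths = np.asarray(lengths)
    ends = np.cumsum(lengths)
    starts = ends - lengths  # same as np.r_[0, ends[:-1]]
    # Trim values to the covered prefix so the last segment
    # stops at ends[-1] instead of running to the end of values.
    return np.add.reduceat(values[:ends[-1]], starts)

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14])
print(segment_sums(a, [8, 2, 3]).tolist())  # [36, 19, 37]
print(segment_sums(a, [8, 2]).tolist())     # [36, 19]
```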