Sorry, I do not know the protocol for re-asking a question if it doesn't get an answer. This question was asked a few months ago here: Numpy sum between pairs of indices in 2d array
I have a 2-d numpy array (MxN) and two 1-d arrays (each of length M) that hold the starting and ending column indices, per row, of the 2-d array that I'd like to sum over. I'm looking for the most efficient way to do this for a large array, preferably without the loop I'm currently using. An example of what I'd like to do is the following.
>>> import numpy as np
>>> np.random.seed(1234)
>>> a = np.random.rand(4, 4)
>>> print(a)
[[0.19151945 0.62210877 0.43772774 0.78535858]
 [0.77997581 0.27259261 0.27646426 0.80187218]
 [0.95813935 0.87593263 0.35781727 0.50099513]
 [0.68346294 0.71270203 0.37025075 0.56119619]]
>>> b = np.array([1, 0, 2, 1])  # starting index for each row
>>> c = np.array([3, 2, 4, 4])  # ending index (exclusive) for each row
>>> d = np.empty(4)
>>> for i in range(4):
...     d[i] = np.sum(a[i, b[i]:c[i]])
...
>>> print(d)
[1.05983651 1.05256841 0.8588124  1.64414897]
My problem is similar to this question: Numpy sum of values in subarrays between pairs of indices. However, I don't think the solution presented there would be very efficient here. In that question, they want the sums of multiple subsets of the same row, so cumsum() can be used. Since I will only be computing one sum per row, I don't think that would be the most efficient way to do it.
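For reference, the cumsum() trick works because the sum over any slice of a row can be read off a prefix-sum array in constant time; a minimal sketch of that identity (the names here are just for illustration):

>>> row = np.array([1.0, 2.0, 3.0, 4.0])
>>> cs = np.concatenate(([0.0], np.cumsum(row)))  # cs[k] == row[:k].sum()
>>> print(cs[3] - cs[1])  # == row[1] + row[2]
5.0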
EDIT: Added timing results for all the answers so far, including the OP's code (following @seberg's comment below); the OP's method is the fastest:
def sliced_sum_op(a, b, c):
    # straightforward Python loop over the rows
    d = np.empty(a.shape[0])
    for i in range(a.shape[0]):
        d[i] = np.sum(a[i, b[i]:c[i]])
    return d
You can still get it done with np.cumsum with a big speed boost, although it will require storage equivalent to the size of your original array:
def sliced_sum(a, b, c):
    # prefix sums along each row, with a zero column prepended so that
    # cum[i, k] holds the sum of a[i, :k]
    cum = np.cumsum(a, axis=1)
    cum = np.hstack((np.zeros((a.shape[0], 1), dtype=a.dtype), cum))
    rows = np.arange(a.shape[0])
    # sum of a[i, b[i]:c[i]] is then cum[i, c[i]] - cum[i, b[i]]
    return cum[rows, c] - cum[rows, b]
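A quick sanity check against the loop version, reusing the a, b and c from the example in the question:

>>> print(np.allclose(sliced_sum(a, b, c), sliced_sum_op(a, b, c)))
True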
Timings are deceptive for your array, because your method is actually slightly faster than this one for small array sizes. But numpy soon wins out; see the graph below for timings on random square arrays of size (n, n):
The plot above was generated with:

import timeit
import numpy as np
import matplotlib.pyplot as plt

# sum_between2 is from @thorsten's answer; sum_between_mmult is defined below
n = np.arange(10, 1000, 10)
op = np.zeros(n.shape[0])
me = np.zeros(n.shape[0])
th = np.zeros(n.shape[0])
jp = np.zeros(n.shape[0])
for j, size in enumerate(n):
    a = np.random.rand(size, size)
    # random start/end indices, sorted so that b[i] <= c[i]
    b, c = np.sort(np.random.randint(size + 1, size=(2, size)), axis=0)
    np.testing.assert_almost_equal(sliced_sum_op(a, b, c),
                                   sliced_sum(a, b, c))
    np.testing.assert_almost_equal(sliced_sum_op(a, b, c),
                                   sum_between2(a, b, c))
    np.testing.assert_almost_equal(sliced_sum_op(a, b, c),
                                   sum_between_mmult(a, b, c))
    op[j] = timeit.timeit('sliced_sum_op(a, b, c)',
                          'from __main__ import sliced_sum_op, a, b, c',
                          number=10)
    me[j] = timeit.timeit('sliced_sum(a, b, c)',
                          'from __main__ import sliced_sum, a, b, c',
                          number=10)
    th[j] = timeit.timeit('sum_between2(a, b, c)',
                          'from __main__ import sum_between2, a, b, c',
                          number=10)
    jp[j] = timeit.timeit('sum_between_mmult(a, b, c)',
                          'from __main__ import sum_between_mmult, a, b, c',
                          number=10)

plt.subplot(211)
plt.plot(n, op, label='op')
plt.plot(n, me, label='jaime')
plt.plot(n, th, label='thorsten')
plt.plot(n, jp, label='japreiss')
plt.xlabel('n')
plt.legend(loc='best')
plt.show()
I like @Jaime's answer, but here is another approach: you can recast the problem as a matrix multiplication. If you multiply a by a vector of all ones, each element of the output vector will contain the sum of the corresponding row of a. To get the d you need, you can mask out the elements that are excluded from each row, and then multiply by the vector of ones to get d.
def sum_between_mmult(ar, b, c):
    copy = np.copy(ar)
    nrows = ar.shape[0]
    ncols = ar.shape[1]
    # zero out everything outside [b[i], c[i]) in each row
    for i in range(nrows):
        copy[i, :b[i]] = 0
        copy[i, c[i]:] = 0
    # row sums of the masked array, as a matrix-vector product
    onevec = np.ones(ncols)
    return np.dot(copy, onevec)
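Reusing the example arrays from the question, this reproduces the same d:

>>> print(sum_between_mmult(a, b, c))
[1.05983651 1.05256841 0.8588124  1.64414897]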
Same as @Jaime, I only saw the speedup for larger matrix sizes. I have the feeling that some kind of fancy indexing trick could get rid of the for loop and give a greater speedup; one possibility is sketched below. If you don't need the original array, you can overwrite it instead of making a copy, but this didn't yield much speedup in my tests.
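For instance, a minimal sketch (not from the original answer; sum_between_bcast is a hypothetical name) that builds the mask with broadcasting instead of a Python loop:

import numpy as np

def sum_between_bcast(ar, b, c):
    # cols[k] == k; compare against each row's bounds via broadcasting,
    # giving mask[i, k] == True exactly when b[i] <= k < c[i]
    cols = np.arange(ar.shape[1])
    mask = (cols >= b[:, None]) & (cols < c[:, None])
    # masked row sums, with no explicit loop over the rows
    return (ar * mask).sum(axis=1)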