Efficient 2d cumsum

Say I have an array like this

>>> a = np.arange(1,8).reshape((1,-1))
>>> a
array([[1, 2, 3, 4, 5, 6, 7]])

and I want to create, for each of the items in a, a "cumsum of the next 4 items". That is, my expected output is

1,       2,      3, 4, 5, 6, 7, 8
1+2,     2+3,     ...
1+2+3    2+3+4    ...
1+2+3+4  2+3+4+5  ...

i.e. a matrix that contains

1, 2, 3, 4, 5, 0, 0, 0
3, 5, 7, 9, 11,0, 0, 0
6, 9, 12,15,18,0, 0, 0
10,14,18,22,26,0, 0, 0

Since the cumsum operation cannot be correctly done for the last 3 items, I expect a 0 there. I know how to do a single cumsum. In fact, the arrays are

a[0, :4].cumsum().reshape((-1,1)); a[0, 1:5].cumsum().reshape((-1,1))...

stacked horizontally. However, I don't know how to do this efficiently. What would be a nice vectorized numpy way of doing this? I'm also open to scipy packages, as long as they beat numpy in terms of efficiency or readability.

FooBar asked Jul 28 '15 12:07


1 Answer

You can do your calculations efficiently using a one-dimensional variant of a technique called the summed area table, also known as the integral image in image processing applications. First you calculate and store your summed area table, which here is a complete cumsum of your first row with a 0 added in front:

import numpy as np

a = np.arange(1, 8)
cs = np.concatenate(([0], np.cumsum(a)))  # cs[i] is the sum of a[:i]

And you can now create each of your "cumsum of the next n items" as cs[n:] - cs[:-n]:

>>> for n in range(1, 5):
...     print(n, '-->', (cs[n:] - cs[:-n])[:4])
...
1 --> [1 2 3 4]
2 --> [3 5 7 9]
3 --> [ 6  9 12 15]
4 --> [10 14 18 22]

You'll need to properly arrange them in the shape you want, but once the original calculation is done, you can compute each item of your output with a single subtraction, which is about as efficient as it can get.
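
As a rough sketch of that arranging step, the subtraction can be done for all window lengths at once with integer array indexing and written into a zero-filled output. The function name windowed_cumsums and its parameter k are made up for this illustration, and the 8-item input is an assumption chosen to match the 8-column matrix in the question:

```python
import numpy as np

def windowed_cumsums(a, k):
    """Hypothetical helper: row n-1 holds the sum of the n items
    starting at each position, for n = 1..k.

    Columns for positions with fewer than k items remaining stay 0.
    """
    a = np.asarray(a).ravel()
    cs = np.concatenate(([0], np.cumsum(a)))     # cs[i] is the sum of a[:i]
    m = a.size - k + 1                           # positions with a full window
    out = np.zeros((k, a.size), dtype=cs.dtype)
    if m > 0:
        starts = np.arange(m)                    # window start indices, shape (m,)
        lengths = np.arange(1, k + 1)[:, None]   # window lengths, shape (k, 1)
        # Broadcasting gives a (k, m) index array; each entry is one subtraction.
        out[:, :m] = cs[starts + lengths] - cs[starts]
    return out

# Assumed input: 8 items, so 5 positions have a full window of 4.
result = windowed_cumsums(np.arange(1, 9), 4)
print(result)
```

With this input the fourth row comes out as 10, 14, 18, 22, 26 followed by three zeros. Only one cumsum pass over the data is needed; everything after that is indexing and a single broadcast subtraction.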

Jaime answered Nov 14 '22 03:11