Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast rolling-sum for list of data vectors (2d matrix)

I am looking for a fast way to compute a rolling-sum, possibly using Numpy. Here is my first approach:

 def func1(M, w):
     Rtn = np.zeros((M.shape[0], M.shape[1]-w+1))
     for i in range(M.shape[1]-w+1):
         Rtn[:,i] = np.sum(M[:, i:w+i], axis=1)
     return Rtn

 M = np.array([[0.,  0.,  0.,  0.,  0.,  1.,  1.,  0.,  1.,  1.,  1.,  0.,  0.],
               [0.,  0.,  1.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  1.],
               [1.,  1.,  0.,  1.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.]])

 window_size = 4
 print func1(M, window_size)

 [[ 0.  0.  1.  2.  2.  3.  3.  3.  3.  2.]
  [ 1.  2.  2.  1.  1.  0.  0.  0.  1.  2.]
  [ 3.  2.  1.  1.  1.  1.  1.  1.  0.  0.]]

I wanted to prevent the window (/sum) from being redone in the loop and hopefully make it much faster so I came up with the following function which limits the sum to only the first and last elements of the rolling window:

 def func2(M, w):
     output = np.zeros((M.shape[0], M.shape[1]-w+1))
     sum = np.sum(M[:, 0:w], axis=1)
     output[:,0] = sum

     for i in range(w, M.shape[1]):
         sum = sum + M[:,i]- M[:,i-w]
         output[:,i-w+1] = sum
     return output

But to my surprise, func2 is barely faster than func1:

 In [251]:
 M = np.random.randint(2, size=3000).reshape(3, 1000)

 window_size = 100
 %timeit func1(M, window_size)
 10 loops, best of 3: 20.9 ms per loop

 In [252]:
 %timeit func2(M, w)
 10 loops, best of 3: 15.5 ms per loop

Am I missing something here? Do you guys know a better, I mean faster way to achieve this?

like image 344
user3329302 Avatar asked Feb 02 '15 22:02

user3329302


People also ask

How do you sum all elements in a matrix in python?

Python numpy sum() function syntax The array elements are used to calculate the sum. If the axis is not provided, the sum of all the elements is returned. If the axis is a tuple of ints, the sum of all the elements in the given axes is returned. We can specify dtype to specify the returned output data type.


1 Answers

Adapted from @Jaime's answer here: https://stackoverflow.com/a/14314054/553404

import numpy as np

def rolling_sum(a, n=4) :
    ret = np.cumsum(a, axis=1, dtype=float)
    ret[:, n:] = ret[:, n:] - ret[:, :-n]
    return ret[:, n - 1:]

M = np.array([[0.,  0.,  0.,  0.,  0.,  1.,  1.,  0.,  1.,  1.,  1.,  0.,  0.],
              [0.,  0.,  1.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  1.],
              [1.,  1.,  0.,  1.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.]])

print(rolling_sum(M)) 

Output

[[ 0.  0.  1.  2.  2.  3.  3.  3.  3.  2.]
 [ 1.  2.  2.  1.  1.  0.  0.  0.  1.  2.]
 [ 3.  2.  1.  1.  1.  1.  1.  1.  0.  0.]]

Timings

In [7]: %timeit rolling_sum(M, 4)
100000 loops, best of 3: 7.89 µs per loop

In [8]: %timeit func1(M, 4)
10000 loops, best of 3: 70.4 µs per loop

In [9]: %timeit func2(M, 4)
10000 loops, best of 3: 54.1 µs per loop
like image 184
YXD Avatar answered Sep 20 '22 01:09

YXD