I am looking for a fast way to compute a rolling-sum, possibly using Numpy. Here is my first approach:
def func1(M, w):
Rtn = np.zeros((M.shape[0], M.shape[1]-w+1))
for i in range(M.shape[1]-w+1):
Rtn[:,i] = np.sum(M[:, i:w+i], axis=1)
return Rtn
M = np.array([[0., 0., 0., 0., 0., 1., 1., 0., 1., 1., 1., 0., 0.],
[0., 0., 1., 0., 1., 0., 0., 0., 0., 0., 0., 1., 1.],
[1., 1., 0., 1., 0., 0., 0., 1., 0., 0., 0., 0., 0.]])
window_size = 4
print func1(M, window_size)
[[ 0. 0. 1. 2. 2. 3. 3. 3. 3. 2.]
[ 1. 2. 2. 1. 1. 0. 0. 0. 1. 2.]
[ 3. 2. 1. 1. 1. 1. 1. 1. 0. 0.]]
I wanted to prevent the window (/sum) from being redone in the loop and hopefully make it much faster so I came up with the following function which limits the sum to only the first and last elements of the rolling window:
def func2(M, w):
output = np.zeros((M.shape[0], M.shape[1]-w+1))
sum = np.sum(M[:, 0:w], axis=1)
output[:,0] = sum
for i in range(w, M.shape[1]):
sum = sum + M[:,i]- M[:,i-w]
output[:,i-w+1] = sum
return output
But to my surprise, func2 is barely faster than func1:
In [251]:
M = np.random.randint(2, size=3000).reshape(3, 1000)
window_size = 100
%timeit func1(M, window_size)
10 loops, best of 3: 20.9 ms per loop
In [252]:
%timeit func2(M, w)
10 loops, best of 3: 15.5 ms per loop
Am I missing something here? Do you guys know a better, I mean faster way to achieve this?
Python numpy sum() function syntax The array elements are used to calculate the sum. If the axis is not provided, the sum of all the elements is returned. If the axis is a tuple of ints, the sum of all the elements in the given axes is returned. We can specify dtype to specify the returned output data type.
Adapted from @Jaime's answer here: https://stackoverflow.com/a/14314054/553404
import numpy as np
def rolling_sum(a, n=4) :
ret = np.cumsum(a, axis=1, dtype=float)
ret[:, n:] = ret[:, n:] - ret[:, :-n]
return ret[:, n - 1:]
M = np.array([[0., 0., 0., 0., 0., 1., 1., 0., 1., 1., 1., 0., 0.],
[0., 0., 1., 0., 1., 0., 0., 0., 0., 0., 0., 1., 1.],
[1., 1., 0., 1., 0., 0., 0., 1., 0., 0., 0., 0., 0.]])
print(rolling_sum(M))
Output
[[ 0. 0. 1. 2. 2. 3. 3. 3. 3. 2.]
[ 1. 2. 2. 1. 1. 0. 0. 0. 1. 2.]
[ 3. 2. 1. 1. 1. 1. 1. 1. 0. 0.]]
Timings
In [7]: %timeit rolling_sum(M, 4)
100000 loops, best of 3: 7.89 µs per loop
In [8]: %timeit func1(M, 4)
10000 loops, best of 3: 70.4 µs per loop
In [9]: %timeit func2(M, 4)
10000 loops, best of 3: 54.1 µs per loop
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With