 

Sliding standard deviation on a 1D NumPy array

Suppose you have a 1D array and want to create a second array whose values are the standard deviations of successive 10-element windows of the first. With a for loop this is easy to write, as in the code below. What I want is to avoid the for loop to get faster execution. Any suggestions?

Code
import numpy as np

a = np.arange(20)
b = np.empty(11)  # 20 - 10 + 1 windows
for i in range(11):
    b[i] = np.std(a[i:i+10])
Elgin Cahangirov asked Feb 05 '23 15:02



1 Answer

You could create a 2D array of sliding windows with np.lib.stride_tricks.as_strided. The windows are views into the given 1D array, so they don't occupy any additional memory. Then simply use np.std along the second axis (axis=1) to get the final result in a vectorized way, like so -

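W = 10  # Window size
nrows = a.size - W + 1  # Number of full windows
n = a.strides[0]  # Byte step between consecutive elements of a
# Row i is a view of a[i:i+W]: both axes advance by one element
a2D = np.lib.stride_tricks.as_strided(a, shape=(nrows, W), strides=(n, n))
out = np.std(a2D, axis=1)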
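
As a possible alternative, assuming NumPy 1.20 or newer is available, the same windowed view can be built with np.lib.stride_tricks.sliding_window_view, which computes the shape and strides itself and returns a read-only view. This is only a sketch of that variant, not part of the original approach; the function name sliding_std is made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sliding_std(a, W):
    """Standard deviation of every length-W window of a 1D array.

    sliding_std is a hypothetical helper name used only for this sketch.
    """
    # Shape (a.size - W + 1, W); a read-only view, no data is copied
    windows = sliding_window_view(a, W)
    return windows.std(axis=1)


a = np.arange(20)
out = sliding_std(a, 10)
print(out.shape)
```

Because the shape and strides are derived from the input, this avoids the risk of reading out-of-bounds memory that comes with passing a wrong shape or stride to as_strided by hand.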

Runtime test

Function definitions -

def original_app(a, W):
    b = np.empty(a.size-W+1)
    for i in range(b.size):
        b[i] = np.std(a[i:i+W])
    return b
    
def vectorized_app(a, W):
    nrows = a.size - W + 1
    n = a.strides[0]
    a2D = np.lib.stride_tricks.as_strided(a,shape=(nrows,W),strides=(n,n))
    return np.std(a2D,1)

Timings and verification -

In [460]: # Inputs
     ...: a = np.arange(10000)
     ...: W = 10
     ...: 

In [461]: np.allclose(original_app(a, W), vectorized_app(a, W))
Out[461]: True

In [462]: %timeit original_app(a, W)
1 loops, best of 3: 522 ms per loop

In [463]: %timeit vectorized_app(a, W)
1000 loops, best of 3: 1.33 ms per loop

So, around 400x speedup there!
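
The %timeit lines above only work inside IPython. As a rough sketch, the same comparison could be reproduced in a plain script with the standard timeit module. The two functions are copied from the definitions above; the repeat counts are arbitrary choices, and the absolute numbers will differ from machine to machine.

```python
import timeit

import numpy as np


def original_app(a, W):
    b = np.empty(a.size - W + 1)
    for i in range(b.size):
        b[i] = np.std(a[i:i + W])
    return b


def vectorized_app(a, W):
    nrows = a.size - W + 1
    n = a.strides[0]
    a2D = np.lib.stride_tricks.as_strided(a, shape=(nrows, W), strides=(n, n))
    return np.std(a2D, 1)


a = np.arange(10000)
W = 10

# Check that both versions agree before timing them
same = np.allclose(original_app(a, W), vectorized_app(a, W))

# Best of 3 runs; the fast version is run 10 times per measurement
t_loop = min(timeit.repeat(lambda: original_app(a, W), number=1, repeat=3))
t_vec = min(timeit.repeat(lambda: vectorized_app(a, W), number=10, repeat=3)) / 10

print(f"loop: {t_loop * 1e3:.2f} ms, vectorized: {t_vec * 1e3:.2f} ms")
```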

For completeness, here's the equivalent pandas version -

import pandas as pd

def pdroll(a, W): # a is 1D ndarray and W is window-size
    return pd.Series(a).rolling(W).std(ddof=0).values[W-1:]
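
A note on the ddof=0 argument: pandas' rolling std defaults to ddof=1 (sample standard deviation), whereas np.std defaults to ddof=0 (population standard deviation), so the argument is what makes the two versions agree. The small NumPy-only sketch below illustrates how large the difference is for a 10-element window; the variable names are chosen just for this example.

```python
import numpy as np

x = np.arange(10)
N = x.size

pop_std = np.std(x, ddof=0)     # divides the sum of squares by N
sample_std = np.std(x, ddof=1)  # divides the sum of squares by N - 1

# The two differ by a constant factor of sqrt(N / (N - 1))
ratio = sample_std / pop_std
print(pop_std, sample_std, ratio)
```

For W = 10 the factor is sqrt(10/9), roughly a 5% difference, which is enough to make an np.allclose check against the NumPy versions fail if ddof=0 is left out.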
Divakar answered Feb 08 '23 14:02
