
Sliding window of M-by-N shape numpy.ndarray

I have a Numpy array of shape (6,2):

[[ 0,  1],
 [10, 11],
 [20, 21],
 [30, 31],
 [40, 41],
 [50, 51]]

I need a sliding window with step size 1 and window size 3 like this:

[[ 0,  1, 10, 11, 20, 21],
 [10, 11, 20, 21, 30, 31],
 [20, 21, 30, 31, 40, 41],
 [30, 31, 40, 41, 50, 51]]

I'm looking for a Numpy solution. If your solution could parametrise the shape of the original array as well as the window size and step size, that'd be great.


I found this related answer, "Using strides for an efficient moving average filter", but I don't see how to specify the step size there or how to collapse the window from the 3-D result into a continuous 2-D array. There is also "Rolling or sliding window iterator?", but that is in pure Python and I'm not sure how efficient it is. It also handles single elements, but does not join them together at the end if each element has multiple features.

siamii asked Mar 30 '13 19:03




2 Answers

You can do a vectorized sliding window in numpy using fancy indexing.

>>> import numpy as np
>>> a = np.array([[0, 1], [10, 11], [20, 21], [30, 31], [40, 41], [50, 51]])
>>> a                                  # define our 2-D NumPy array
array([[ 0,  1],
       [10, 11],
       [20, 21],
       [30, 31],
       [40, 41],
       [50, 51]])
>>> a = a.flatten()
>>> a                                  # flattened NumPy array
array([ 0,  1, 10, 11, 20, 21, 30, 31, 40, 41, 50, 51])
>>> indexer = np.arange(6)[None, :] + 2*np.arange(4)[:, None]
>>> indexer                            # sliding window indices
array([[ 0,  1,  2,  3,  4,  5],
       [ 2,  3,  4,  5,  6,  7],
       [ 4,  5,  6,  7,  8,  9],
       [ 6,  7,  8,  9, 10, 11]])
>>> a[indexer]                         # values of a over the sliding window
array([[ 0,  1, 10, 11, 20, 21],
       [10, 11, 20, 21, 30, 31],
       [20, 21, 30, 31, 40, 41],
       [30, 31, 40, 41, 50, 51]])
>>> np.sum(a[indexer], axis=1)         # sum of the values of a under each window
array([ 63, 123, 183, 243])

Explanation of what this code is doing:

np.arange(6)[None, :] creates a row vector holding 0 through 5, and np.arange(4)[:, None] creates a column vector holding 0 through 3. Broadcasting their sum gives a 4x6 matrix in which each row (six entries) holds the indices of one window, and the number of rows (four) is the number of windows. Multiplying the column vector by 2 makes the window slide 2 elements at a time, which is necessary because each row of the original array holds two values. Indexing the flattened array with this integer index array (fancy indexing) returns every window at once, and you can then compute aggregates such as sum over them.
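The hard-coded 6, 4 and 2 in that indexer follow from the array shape, window size and step size, so the same idea can be parametrised. The following is only a sketch under that reading: the function name sliding_window_index and its parameters are made up for illustration and are not part of NumPy or of the answer above.

```python
import numpy as np

def sliding_window_index(a, width=3, stepsize=1):
    """Hypothetical helper: flattened sliding windows over the rows of a 2-D array."""
    n_rows, n_cols = a.shape
    flat = a.ravel()
    # Offsets inside one window: `width` rows of `n_cols` values each.
    within = np.arange(width * n_cols)[None, :]
    # Position in the flattened array where each window starts.
    starts = n_cols * np.arange(0, n_rows - width + 1, stepsize)[:, None]
    # Broadcasting gives one row of indices per window.
    return flat[starts + within]

a = np.array([[0, 1], [10, 11], [20, 21], [30, 31], [40, 41], [50, 51]])
windows = sliding_window_index(a, width=3, stepsize=1)
print(windows)
```

With width=3 and stepsize=1 this rebuilds the 4x6 indexer from the session above; a larger stepsize simply skips window start positions.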

user42541 answered Sep 21 '22 18:09


In [1]: import numpy as np

In [2]: a = np.array([[0, 1], [10, 11], [20, 21], [30, 31], [40, 41], [50, 51]])

In [3]: w = np.hstack((a[:-2], a[1:-1], a[2:]))

In [4]: w
Out[4]:
array([[ 0,  1, 10, 11, 20, 21],
       [10, 11, 20, 21, 30, 31],
       [20, 21, 30, 31, 40, 41],
       [30, 31, 40, 41, 50, 51]])

You could write this as a function like so:

def window_stack(a, stepsize=1, width=3):
    n = a.shape[0]
    # Pass a list: recent NumPy versions reject a bare generator in np.hstack.
    return np.hstack([a[i:1+n+i-width:stepsize] for i in range(width)])

This doesn't really depend on the shape of the original array, as long as a.ndim == 2. Note that I never use either length in the interactive version. The second dimension of the shape is irrelevant; each row can be as long as you want. Thanks to @Jaime's suggestion, you can do it without checking the shape at all:

def window_stack(a, stepsize=1, width=3):
    # (1+i-width) is 0 for the last slice, so "or None" lets it run to the end.
    return np.hstack([a[i:1+i-width or None:stepsize] for i in range(width)])
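As a cross-check, and since the question asks about a strides-based route, here is a sketch using numpy.lib.stride_tricks.sliding_window_view. This is a different technique from the hstack approach above, it assumes NumPy 1.20 or newer, and the name window_view_stack is made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_view_stack(a, stepsize=1, width=3):
    """Hypothetical helper: same output layout as window_stack, built from a strided view."""
    n_cols = a.shape[1]
    # Each window spans `width` rows and every column;
    # the view has shape (n_rows - width + 1, 1, width, n_cols).
    view = sliding_window_view(a, (width, n_cols))
    # Keep every `stepsize`-th window, drop the length-1 axis, then flatten each window.
    # The reshape copies the data, so the result is an ordinary 2-D array.
    return view[::stepsize, 0].reshape(-1, width * n_cols)

a = np.array([[0, 1], [10, 11], [20, 21], [30, 31], [40, 41], [50, 51]])
w = window_view_stack(a, stepsize=1, width=3)
print(w)
```

The strided view itself costs no extra memory; only the final reshape materialises the windows, which is also what hstack does.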
askewchan answered Sep 19 '22 18:09