
NumPy: grouping every N consecutive elements?

I would like to extract every group of N consecutive elements from an array. For a NumPy array like this:

a = numpy.array([1,2,3,4,5,6,7,8])

I wish to have (N=5):

array([[1,2,3,4,5],
       [2,3,4,5,6],
       [3,4,5,6,7],
       [4,5,6,7,8]])

so that I can apply further functions, such as average and sum, to each group. How do I produce such an array?

asked Apr 26 '15 09:04 by He Shiming


2 Answers

One approach with broadcasting -

import numpy as np
# a is the 1-D input array and N is the window length
out = a[np.arange(a.size - N + 1)[:,None] + np.arange(N)]

Sample run -

In [31]: a
Out[31]: array([4, 2, 5, 4, 1, 6, 7, 3])

In [32]: N
Out[32]: 5

In [33]: out
Out[33]: 
array([[4, 2, 5, 4, 1],
       [2, 5, 4, 1, 6],
       [5, 4, 1, 6, 7],
       [4, 1, 6, 7, 3]])
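
To make the indexing step above easier to follow, here is a minimal sketch that builds the index matrix separately before applying it. The names `starts`, `offsets`, `idx`, `row_sums` and `row_means` are introduced here for illustration and are not part of the original answer.

```python
import numpy as np

a = np.array([4, 2, 5, 4, 1, 6, 7, 3])
N = 5

# Column vector of window start positions, shape (4, 1)
starts = np.arange(a.size - N + 1)[:, None]
# Row vector of offsets inside one window, shape (5,)
offsets = np.arange(N)

# Broadcasting the sum yields a (4, 5) matrix of indices into `a`:
# row i holds i, i+1, ..., i+N-1
idx = starts + offsets

# Integer-array indexing copies the selected elements into a new array
out = a[idx]

# Per-window reductions, as asked for in the question
row_sums = out.sum(axis=1)
row_means = out.mean(axis=1)
```

Because `a[idx]` is integer-array indexing, `out` is an independent copy, so it can be modified without affecting `a`.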
answered Nov 09 '22 19:11 by Divakar


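You could use the rolling_window function from this blog:

import numpy as np

def rolling_window(a, window):
    # One row per window position along the last axis, each `window` elements long
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    # Reuse the last-axis stride so consecutive rows advance by one element
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)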

In [37]: a = np.array([1,2,3,4,5,6,7,8])

In [38]: rolling_window(a, 5)
Out[38]:
array([[1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7],
       [4, 5, 6, 7, 8]])
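
As a side note that is not part of the original answer: NumPy 1.20 and later ship `numpy.lib.stride_tricks.sliding_window_view`, which produces the same windows without hand-written stride arithmetic. The sketch below assumes such a NumPy version is installed, and also shows the per-window sum and average the question asks about.

```python
import numpy as np
# Assumes NumPy >= 1.20, where sliding_window_view was added
from numpy.lib.stride_tricks import sliding_window_view

a = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# Read-only view of shape (4, 5); no element data is copied
windows = sliding_window_view(a, 5)

# Reduce along axis 1 to get one value per window
sums = windows.sum(axis=1)
means = windows.mean(axis=1)
```

The result is a read-only view by default, which guards against accidentally writing through overlapping windows.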

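I liked @Divakar's solution. However, for larger arrays and windows, you may want to use rolling_window: as_strided returns a view of the original data rather than copying it, so its cost stays roughly constant as the window grows. In the timings below, broadcast_f is presumably the broadcasting approach wrapped in a function (its definition is not shown here).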

In [55]: a = np.arange(1000)

In [56]: %timeit rolling_window(a, 5)
100000 loops, best of 3: 9.02 µs per loop

In [57]: %timeit broadcast_f(a, 5)
10000 loops, best of 3: 87.7 µs per loop

In [58]: %timeit rolling_window(a, 100)
100000 loops, best of 3: 8.93 µs per loop

In [59]: %timeit broadcast_f(a, 100)
1000 loops, best of 3: 1.04 ms per loop
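
A plausible reading of the gap in these timings is that `rolling_window` only creates a view while the broadcasting approach copies every element. The sketch below checks this; note that `broadcast_f` is defined here as an assumption (the original post does not show it), simply wrapping the broadcasting expression from the other answer.

```python
import numpy as np

def rolling_window(a, window):
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

def broadcast_f(a, N):
    # Hypothetical definition: assumed to wrap the broadcasting approach
    return a[np.arange(a.size - N + 1)[:, None] + np.arange(N)]

a = np.arange(1000)
view = rolling_window(a, 100)   # shape (901, 100), shares a's buffer
copy = broadcast_f(a, 100)      # shape (901, 100), freshly allocated

# Both approaches produce the same values
same_values = np.array_equal(view, copy)

# Only the strided result aliases the original array's memory
view_shares = np.shares_memory(a, view)
copy_shares = np.shares_memory(a, copy)
```

Since the strided result aliases `a` and its rows overlap each other, writing into it can change several windows at once; treat it as read-only, or call `.copy()` on it first if you need to modify the windows.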
answered Nov 09 '22 20:11 by Zero