
Threshold numpy array, find windows

Input data is a 2D array (timestamp, value) pairs, ordered by timestamp:

a = np.array([[50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66],
              [ 2,  3,  5,  6,  4,  2,  1,  2,  3,  4,  5,  4,  3,  2,  1,  2,  3]])

I want to find time windows where the value meets or exceeds a threshold (e.g. >= 4). It seems I can do the threshold part with a boolean condition, and map back to the timestamps with np.extract():

>>> a[1] >= 4
array([False, False,  True,  True,  True, False, False, False, False,
        True,  True,  True, False, False, False, False, False])

>>> np.extract(a[1] >= 4, a[0])
array([52, 53, 54, 59, 60, 61])

But from that I need the first and last timestamps of each window matching the threshold (i.e. [[52, 54], [59, 61]]), which is where I can't quite find the right approach.

rcoup asked Jan 08 '19 10:01


2 Answers

Once you have array([52, 53, 54, 59, 60, 61]), you can use numpy.split in the following way:

import numpy as np

a = np.array([52, 53, 54, 59, 60, 61])
# Differences between neighbouring timestamps; a gap > 1 ends a window
gaps = np.diff(a)
# numpy.split needs the index where each new piece starts, hence the + 1
indices = [inx + 1 for inx, j in enumerate(gaps) if j > 1]
suba = np.split(a, indices)
print(suba)  # prints [array([52, 53, 54]), array([59, 60, 61])]

Note that numpy.split takes the starting index of each new piece as its 2nd argument; in this example indices is [3] (a list with one value).
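
To get from the split runs to the [first, last] pairs the question asks for, something like the following might work. It is only a sketch: the names hits, breaks and windows are mine, and it assumes the timestamps are spaced exactly one unit apart and that at least one value met the threshold.

```python
import numpy as np

# Timestamps where the value met the threshold (taken from the question).
hits = np.array([52, 53, 54, 59, 60, 61])

# A new window starts right after any gap larger than 1 between neighbours.
# Assumption: timestamps are evenly spaced, one unit apart.
breaks = np.flatnonzero(np.diff(hits) > 1) + 1

# Split into consecutive runs and keep only the first and last timestamp of each.
windows = [[int(run[0]), int(run[-1])] for run in np.split(hits, breaks)]
print(windows)  # [[52, 54], [59, 61]]
```

If hits can be empty, np.split returns a single empty array and run[0] raises an IndexError, so that case would need a guard.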

Daweo answered Sep 22 '22 00:09


Here's one way:

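# Create a boolean mask of the values that meet the threshold
In [42]: mask = (a[1] >= 4)
# Find the indices where the mask changes (start and end of each window)
In [43]: ind = np.where(np.diff(mask))[0]
# np.diff flags the element before each change, so shift the start indices by 1
In [44]: ind[::2] += 1
# Look up the timestamps and reshape into (start, end) pairs
In [45]: result = a[0][ind].reshape(-1, 2)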

In [46]: result
Out[46]: 
array([[52, 54],
       [59, 61]])
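
One caveat with the steps above: they rely on the mask starting and ending with False. If a window touches the first or last sample, the transitions no longer pair up as (start, end). A possible way around that is to pad the mask with False on both sides before looking for changes. This is a sketch under that assumption, and threshold_windows is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

def threshold_windows(timestamps, values, threshold):
    """Return an (n, 2) array of [first, last] timestamps for each run
    of consecutive samples where values >= threshold."""
    ts = np.asarray(timestamps)
    mask = np.asarray(values) >= threshold
    # Pad with False so runs touching either edge still yield both a
    # start and an end transition.
    padded = np.concatenate(([False], mask, [False]))
    # Positions where the padded mask changes; they alternate between
    # a run's first index and the index just past its last element.
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts = edges[::2]
    ends = edges[1::2] - 1  # convert exclusive end to inclusive
    return np.column_stack((ts[starts], ts[ends]))

a = np.array([[50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66],
              [ 2,  3,  5,  6,  4,  2,  1,  2,  3,  4,  5,  4,  3,  2,  1,  2,  3]])

result = threshold_windows(a[0], a[1], 4)
print(result.tolist())  # [[52, 54], [59, 61]]
```

With no matching samples this returns an empty array of shape (0, 2), and unlike splitting on timestamp gaps it does not depend on the spacing of the timestamps.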

Mazdak answered Sep 21 '22 00:09