Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Numpy Array: First occurence of N consecutive values smaller than threshold

Tags:

python

numpy

I have a one-dimensional numpy array - for example,

a = np.array([1, 4, 5, 7, 1, 2, 2, 4, 10])

I would like to obtain the index of the first number for which the N subsequent values are below a certain value x.

In this case, for N=3 and x=3, I would search for the first number for which the three entries following it were all less than three. This would be a[4].

This can easily be implemented by simply iterating through all the values via a for loop, but I was wondering if there any cleaner and more efficient ways of accomplishing this.

like image 946
Henry Shackleton Avatar asked Aug 29 '19 14:08

Henry Shackleton


People also ask

What does [: :] mean on Numpy arrays?

The [:, :] stands for everything from the beginning to the end just like for lists. The difference is that the first : stands for first and the second : for the second dimension. a = numpy. zeros((3, 3)) In [132]: a Out[132]: array([[ 0., 0., 0.], [ 0., 0., 0.], [ 0., 0., 0.]])

What does [: None mean in Numpy?

None is an alias for NP. newaxis. It creates an axis with length 1.

How do I get Argmax in Python?

Use np. arange() function to create an array and then use np argmax() function. Let's use the numpy arange() function to create a two-dimensional array and find the index of the maximum value of the array.


1 Answers

Approach #1 :

Here's a vectorized NumPy way -

def start_valid_island(a, thresh, window_size):
    m = a<thresh
    me = np.r_[False,m,False]
    idx = np.flatnonzero(me[:-1]!=me[1:])
    lens = idx[1::2]-idx[::2]
    return idx[::2][(lens >= window_size).argmax()]

Sample runs -

In [44]: a
Out[44]: array([ 1,  4,  5,  7,  1,  2,  2,  4, 10])

In [45]: start_valid_island(a, thresh=3, window_size=3)
Out[45]: 4

In [46]: a[:3] = 1

In [47]: start_valid_island(a, thresh=3, window_size=3)
Out[47]: 0

Approach #2 :

With SciPy's binary-erosion -

from scipy.ndimage.morphology import binary_erosion

def start_valid_island_v2(a, thresh, window_size):
    m = a<thresh
    k = np.ones(window_size,dtype=bool)
    return binary_erosion(m,k,origin=-(window_size//2)).argmax()

Approach #3 :

To complete the set, here's a loopy one based on short-ciruiting and using the efficiency of numba -

from numba import njit

@njit
def start_valid_island_v3(a, thresh, window_size):
    n = len(a)
    out = None
    for i in range(n-window_size+1):
        found = True
        for j in range(window_size):
            if a[i+j]>=thresh:
                found = False
                break
        if found:
            out = i
            break
    return out

Timings -

In [142]: np.random.seed(0)
     ...: a = np.random.randint(0,10,(100000000))

In [145]: %timeit start_valid_island(a, thresh=3, window_size=3)
1 loop, best of 3: 810 ms per loop

In [146]: %timeit start_valid_island_v2(a, thresh=3, window_size=3)
1 loop, best of 3: 1.27 s per loop

In [147]: %timeit start_valid_island_v3(a, thresh=3, window_size=3)
1000000 loops, best of 3: 608 ns per loop
like image 57
Divakar Avatar answered Oct 10 '22 07:10

Divakar