
Getting ranges of sequences of identical entries with minimum length in a NumPy array

Consider an array whose entries are exclusively -1 or 1. How do I get the ranges of all slices that contain only 1s and have a minimum length t (e.g. t=3)?

Example:

>>> import numpy as np
>>> a = np.array([-1,-1,1,1,1,1,1,-1,1,-1,-1,1,1,1,1], dtype=int)
>>> a
array([-1, -1,  1,  1,  1,  1,  1, -1,  1, -1, -1,  1,  1,  1,  1])

Then, the desired output for t=3 would be [(2,7),(11,15)].
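
To make the expected format concrete: judging from the example, each pair appears to be a half-open (start, stop) range, i.e. usable directly as a[start:stop]. The small check below is only a sketch under that reading; the names `expected`, `all_ones` and `lengths` are introduced here for illustration and are not part of the original question.

```python
import numpy as np

a = np.array([-1, -1, 1, 1, 1, 1, 1, -1, 1, -1, -1, 1, 1, 1, 1], dtype=int)

# Desired output from the question, read as half-open (start, stop) pairs
expected = [(2, 7), (11, 15)]

# Every slice a[start:stop] should contain only 1s ...
all_ones = [bool((a[start:stop] == 1).all()) for start, stop in expected]

# ... and have length stop - start of at least t = 3
lengths = [stop - start for start, stop in expected]

print(all_ones)  # [True, True]
print(lengths)   # [5, 4]
```

Note that the single 1 at index 8 forms a run of length 1, so it is excluded for t=3.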

corinna asked Oct 20 '22 00:10



1 Answer

One approach uses np.diff and np.where:

import numpy as np

# Pad with -1 at both ends so that runs touching the array edges are detected,
# then take the first difference
dfa = np.diff(np.hstack((-1, a, -1)))

# A jump of +2 (-1 -> 1) marks the start of a run of 1s in `a`;
# a jump of -2 (1 -> -1) marks its exclusive stop
starts = np.where(dfa == 2)[0]
stops = np.where(dfa == -2)[0]

# Keep only the (start, stop) pairs whose run length is at least 3
valid_mask = (stops - starts) >= 3

# Finally collect the valid pairs as the output
out = np.column_stack((starts, stops))[valid_mask].tolist()
# out is [[2, 7], [11, 15]] for the sample array
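
For reuse, the same steps can be wrapped in a function that takes the minimum length as a parameter. This is a minimal sketch, not part of the original answer: the function name `runs_of_ones` and the choice to return a list of tuples (to match the format asked for in the question) are assumptions made here.

```python
import numpy as np


def runs_of_ones(a, t):
    """Return half-open (start, stop) ranges of runs of 1s with length >= t.

    Hypothetical helper; assumes `a` contains only -1 and 1.
    """
    a = np.asarray(a, dtype=int)

    # Pad with -1 on both sides so runs at the array edges get a start/stop
    dfa = np.diff(np.hstack(([-1], a, [-1])))

    # +2 means -1 -> 1 (run starts), -2 means 1 -> -1 (run stops, exclusive)
    starts = np.where(dfa == 2)[0]
    stops = np.where(dfa == -2)[0]

    # Keep only runs that are long enough
    valid = (stops - starts) >= t
    return list(zip(starts[valid].tolist(), stops[valid].tolist()))


a = np.array([-1, -1, 1, 1, 1, 1, 1, -1, 1, -1, -1, 1, 1, 1, 1], dtype=int)
result = runs_of_ones(a, 3)
print(result)  # [(2, 7), (11, 15)]
```

With t=1 the isolated 1 at index 8 is included as well, giving [(2, 7), (8, 9), (11, 15)].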
Divakar answered Oct 22 '22 10:10
