I have a numpy
boolean array
w=np.array([True,False,True,True,False,False,False])
I would like to get the index of the first time there are at n_at_least
false values.
For instance here
`n_at_least`=1 -> desired_index=1
`n_at_least`=3 -> desired_index=4
I have tried
np.cumsum(~w)
which does increase every time a False
value is encountered.
However, when True
is encountered the counter is not starting from 0 again so I only get the total count of False
elements rather than the count of the last consecutive ones.
Here's a vectorized solution that finds the start, stop indices and hence lengths of islands of zeros and finally uses argmax
to get the starting index of the first island satisfying the criteria of zeros count being >= n
-
def first_occ_index(w, n):
idx = np.flatnonzero(np.r_[True, w, True])
lens = np.diff(idx) - 1
return idx[(lens >= n).argmax()]
Sample run -
In [107]: w
Out[107]: array([ True, False, True, True, False, False, False])
In [108]: first_occ_index(w, n=1)
Out[108]: 1
In [109]: first_occ_index(w, n=3)
Out[109]: 4
I think you're falling into the numpy trap of only wanting to use numpy functions. What's wrong with python? This solution is O(n)
def f(array, n_at_least):
curr_found_false = 0
curr_index = 0
for index, elem in enumerate(array):
if not elem:
if curr_found_false == 0:
curr_index = index
curr_found_false += 1
if curr_found_false == n_at_least:
return curr_index
else:
curr_found_false = 0
Outputs
w=np.array([True,False,True,True,False,False,False])
f(w, 1)
# --> 1
f(w, 3)
# --> 4
Here is an O(n) numpy solution:
>>> def first_consec(A, n):
... A = np.r_[True, A, True]
... switch, = np.where(A[:-1]!=A[1:])
... runs = switch[1::2] - switch[::2]
... idx = np.argmax(runs >= n)
... if runs[idx] < n:
... return None
... return switch[2*idx]
...
>>> first_consec(w, 4)
>>> first_consec(w, 3)
4
>>> first_consec(w, 2)
4
>>> first_consec(w, 1)
1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With