I have a boolean (NumPy) array, and I want to count the length of each run of consecutive True values between the Falses.
E.g., for a sample list:
b_List = [T,T,T,F,F,F,F,T,T,T,F,F,T,F]
should produce
ml = [3,3,1]
My initial attempt was this snippet:
i = 0
ml = []
for el in b_List:
    if (b_List):
        i += 1
    ml.append(i)
    i = 0
But it keeps appending elements in ml for each F in the b_List.
EDIT
Thank you all for your answers. Sadly, I can't accept all the answers as correct. I've accepted Akavall's answer because he referred to my initial attempt (I know what I did wrong now) and also compared Mark's and Ashwini's posts.
Please don't take the accepted solution as the definitive answer, since both of the other suggestions introduce alternative methods that work equally well.
itertools.groupby provides one easy way to do this:
>>> import itertools
>>> T, F = True, False
>>> b_List = [T,T,T,F,F,F,F,T,T,T,F,F,T,F]
>>> [len(list(group)) for value, group in itertools.groupby(b_List) if value]
[3, 3, 1]
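For comparison, the loop from the question can be repaired in plain Python. This is only a minimal sketch (the function name true_run_lengths is made up for illustration): it tests each element el rather than the whole list, and appends a count only when a run of True ends, including a run that reaches the end of the list.

```python
def true_run_lengths(values):
    """Return the lengths of consecutive runs of truthy values."""
    runs = []
    count = 0
    for el in values:
        if el:
            # Still inside a run of True: keep counting.
            count += 1
        elif count:
            # A False right after a run: record the run and reset.
            runs.append(count)
            count = 0
    if count:
        # The sequence ended while a run was still open.
        runs.append(count)
    return runs


T, F = True, False
b_List = [T, T, T, F, F, F, F, T, T, T, F, F, T, F]
print(true_run_lengths(b_List))  # [3, 3, 1]
```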
Using NumPy:
>>> import numpy as np
>>> a = np.array([ True, True, True, False, False, False, False, True, True, True, False, False, True, False], dtype=bool)
>>> np.diff(np.insert(np.where(np.diff(a)==1)[0]+1, 0, 0))[::2]
array([3, 3, 1])
>>> a = np.array([True, False, False, True, True, False, False, True, False])
>>> np.diff(np.insert(np.where(np.diff(a)==1)[0]+1, 0, 0))[::2]
array([1, 2, 1])
I can't say that this is the best NumPy solution, but it is still faster than itertools.groupby:
>>> from itertools import groupby
>>> lis = [ True, True, True, False, False, False, False, True, True, True, False, False, True, False]*1000
>>> a = np.array(lis)
>>> %timeit [len(list(group)) for value, group in groupby(lis) if value]
100 loops, best of 3: 9.58 ms per loop
>>> %timeit np.diff(np.insert(np.where(np.diff(a)==1)[0]+1, 0, 0))[::2]
1000 loops, best of 3: 1.4 ms per loop
>>> lis = [ True, True, True, False, False, False, False, True, True, True, False, False, True, False]*10000
>>> a = np.array(lis)
>>> %timeit [len(list(group)) for value, group in groupby(lis) if value]
1 loops, best of 3: 95.5 ms per loop
>>> %timeit np.diff(np.insert(np.where(np.diff(a)==1)[0]+1, 0, 0))[::2]
100 loops, best of 3: 14.9 ms per loop
As @justhalf and @Mark Dickinson pointed out in the comments, the above code does not work in some cases (for example, when the array starts with False or ends with True), so you need to pad the array with False on both ends first:
In [28]: a
Out[28]:
array([ True, True, True, False, False, False, False, True, True,
True, False, False, True, False], dtype=bool)
In [29]: np.diff(np.where(np.diff(np.hstack([False, a, False])))[0])[::2]
Out[29]: array([3, 3, 1])
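The padded version can be wrapped in a small helper to check the edge cases. This is a sketch rather than a definitive implementation, and the function name true_run_lengths_np is hypothetical. It assumes a 1-D input and a NumPy version in which np.diff on a boolean array returns True wherever neighbouring elements differ.

```python
import numpy as np


def true_run_lengths_np(a):
    """Lengths of consecutive runs of True in a 1-D boolean array."""
    a = np.asarray(a, dtype=bool)
    # Pad with False so every run of True has both a start and an end edge.
    padded = np.hstack([False, a, False])
    # Positions where the value flips: start, end, start, end, ...
    edges = np.where(np.diff(padded))[0]
    # Gaps between edges alternate (True run, False run); keep the True runs.
    return np.diff(edges)[::2]


a = np.array([True, True, True, False, False, False, False,
              True, True, True, False, False, True, False])
print(true_run_lengths_np(a).tolist())                           # [3, 3, 1]
print(true_run_lengths_np([False, True, True, False]).tolist())  # [2]
print(true_run_lengths_np([True, True]).tolist())                # [2]
```

The last two calls are the cases the unpadded expression gets wrong: an array that starts with False, and one that ends with True.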