I have a pandas Series of Boolean values, and I would like to label contiguous groups of True values. How can I do this, ideally in a vectorised manner? Any help would be hugely appreciated!
Data:
        A
0   False
1    True
2    True
3    True
4   False
5   False
6    True
7   False
8   False
9    True
10   True
Desired:
        A  Label
0   False      0
1    True      1
2    True      1
3    True      1
4   False      0
5   False      0
6    True      2
7   False      0
8   False      0
9    True      3
10   True      3
Here's an unlikely but simple, working solution:
import numpy as np
from scipy import ndimage  # the scipy.ndimage.measurements namespace is deprecated
labeled, clusters = ndimage.label(df.A.values)
# labeled is what you want; clusters is the number of clusters.
df['Label'] = labeled  # puts it into df (attribute assignment would not create a column)
Tested as:
>>> a = np.array([False, False, True, True, True, False, True, False, False,
...               True, False, True, True, True, True, True, True, True,
...               False, True], dtype=bool)
>>> labeled, clusters = ndimage.label(a)
>>> labeled
array([0, 0, 1, 1, 1, 0, 2, 0, 0, 3, 0, 4, 4, 4, 4, 4, 4, 4, 0, 5], dtype=int32)
>>> clusters
5
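For reference, here is the same idea applied to the data from the question. This is only a minimal sketch: it assumes a DataFrame named df with a Boolean column A, as in the question, and imports label from scipy.ndimage directly.

```python
import pandas as pd
from scipy import ndimage

# Assumed setup: the example data from the question.
df = pd.DataFrame({"A": [False, True, True, True, False, False,
                         True, False, False, True, True]})

# label() numbers each run of consecutive True values 1, 2, 3, ...
# and leaves the False positions at 0.
labeled, clusters = ndimage.label(df["A"].to_numpy())
df["Label"] = labeled

print(df)
print(clusters)  # number of runs found
```

For this data the Label column matches the desired output, and clusters is 3.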
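Using cumsum:
import numpy as np
import pandas as pd
a = df.A.values
z = np.zeros(a.shape, int)
# (~a).cumsum() is constant within each run of True values, so factorize
# turns those constants into consecutive codes 0, 1, 2, ...
z[a] = pd.factorize((~a).cumsum()[a])[0] + 1
df.assign(Label=z)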
        A  Label
0   False      0
1    True      1
2    True      1
3    True      1
4   False      0
5   False      0
6    True      2
7   False      0
8   False      0
9    True      3
10   True      3
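To see why the cumsum approach works, it may help to look at the intermediate arrays. The sketch below unpacks the one-liner step by step on the data from the question; the names false_count, run_ids and codes are introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

# Assumed setup: the example data from the question.
df = pd.DataFrame({"A": [False, True, True, True, False, False,
                         True, False, False, True, True]})
a = df["A"].to_numpy()

# Running count of False values. It stays constant inside a run of True
# values and differs between runs separated by at least one False.
false_count = (~a).cumsum()
# [1 1 1 1 2 3 3 4 5 5 5]

# Keep the counter only at the True positions: one distinct value per run.
run_ids = false_count[a]
# [1 1 1 3 5 5]

# factorize maps the distinct values to 0, 1, 2, ... in order of appearance.
codes, _ = pd.factorize(run_ids)
# [0 0 0 1 2 2]

# Write the codes (shifted to start at 1) back into the True positions.
z = np.zeros(a.shape, dtype=int)
z[a] = codes + 1
result = df.assign(Label=z)
print(result)
```

The False rows keep the initial 0, so only the True runs receive a positive label.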