Find groups of neighboring True values in a pandas Series

Tags: python, pandas

I have a boolean Series and need to find all groups of consecutive True values, that is, the start index and end index of each run of neighboring True values.

The following code gives the intended result, but it is slow and clumsy because it loops over every element in Python.

import pandas as pd

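def groups(ser):
    g = []

    flag = False   # True while inside a run of True values
    start = None   # index label where the current run began
    prev = None    # index label of the previous element
    for idx, s in ser.items():
        if flag and not s:
            # The run ended at the previous element.
            g.append((start, prev))
            flag = False
        elif not flag and s:
            start = idx
            flag = True
        prev = idx
    if flag:
        # The series ends inside a run.
        g.append((start, prev))
    return g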

if __name__ == "__main__":
    ser = pd.Series([1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1], dtype=bool)
    print(ser)

    g = groups(ser)
    print("\ngroups of True:")
    for start, end in g:
        print("from {} until {}".format(start, end))

The output is:

0      True
1      True
2     False
3     False
4      True
5     False
6     False
7      True
8      True
9      True
10     True
11    False
12     True
13    False
14     True

groups of True:
from 0 until 1
from 4 until 4
from 7 until 10
from 12 until 12
from 14 until 14

There are similar questions, but none of them asks for the indices where each group starts and ends:

Label contiguous groups of True elements within a pandas Series
Streaks of True or False in pandas Series


asked Mar 04 '21 14:03

user7431005

2 Answers

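A common idiom is to take the cumsum of the negated series: it stays constant across each run of True values and increases at every False, so it can serve as a group key for consecutive blocks. For example, with s being the boolean series (ser in the question):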

for _,x in s[s].groupby((1-s).cumsum()):
    print(f'from {x.index[0]} to {x.index[-1]}')

Output:

from 0 to 1
from 4 to 4
from 7 to 10
from 12 to 12
from 14 to 14
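If the (start, end) pairs are needed as a list, as in the question's groups function, the same idea can be wrapped up roughly as follows. This is only a sketch: the name true_runs is made up for illustration, and it assumes a boolean Series without missing values.

```python
import pandas as pd

def true_runs(s):
    """Return (start, end) index labels for each run of consecutive True values.

    true_runs is a hypothetical helper name; s is assumed to be a boolean Series.
    """
    key = (~s).cumsum()                  # constant within each run of True
    labels = s.index.to_series()[s]      # index labels of the True entries
    grouped = labels.groupby(key[s])     # one group per run
    return list(zip(grouped.first(), grouped.last()))

ser = pd.Series([1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1], dtype=bool)
for start, end in true_runs(ser):
    print(f"from {start} to {end}")
```

Because the start and end are read from the index labels themselves, this should also work when the index is not a plain integer range.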

answered Oct 19 '22 20:10

Quang Hoang

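You can use itertools. Within a run of consecutive integers, the difference between an element's position in the list and its value is constant, so grouping by that difference splits the True indices into runs. Note that this relies on the index holding consecutive integers: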

In [478]: from operator import itemgetter
     ...: from itertools import groupby

In [489]: a = ser[ser].index.tolist() # Create a list of indexes having `True` in `ser` 

In [498]: for k, g in groupby(enumerate(a), lambda ix : ix[0] - ix[1]):
     ...:     l = list(map(itemgetter(1), g))
     ...:     print(f'from {l[0]} to {l[-1]}')
     ...: 
from 0 to 1
from 4 to 4
from 7 to 10
from 12 to 12
from 14 to 14
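To see why that grouping key works, it may help to print it for the True indices of the example. The sketch below uses plain Python; consecutive_runs is a hypothetical helper name, and the labels list is copied by hand from the question's example rather than computed from ser.

```python
from itertools import groupby

def consecutive_runs(labels):
    """Split a sorted list of integers into (first, last) runs of consecutive values.

    consecutive_runs is a hypothetical helper name used for illustration.
    """
    runs = []
    # value - position is the same for every member of a consecutive run.
    for _, grp in groupby(enumerate(labels), key=lambda pair: pair[1] - pair[0]):
        members = [value for _, value in grp]
        runs.append((members[0], members[-1]))
    return runs

# Index labels of the True entries in the question's example series.
labels = [0, 1, 4, 7, 8, 9, 10, 12, 14]

keys = [value - pos for pos, value in enumerate(labels)]
print(keys)                      # [0, 0, 2, 4, 4, 4, 4, 5, 6]
print(consecutive_runs(labels))  # [(0, 1), (4, 4), (7, 10), (12, 12), (14, 14)]
```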

answered Oct 19 '22 20:10

Mayank Porwal
