My dataframe looks like
ID colA
1 B
1 D
2 B
2 D
2 C
I have to return all rows after the last occurrence of event B in each group. The expected output is:
ID colA
1 D
2 D
2 C
I tried
a = df['colA'].str.contains('B').groupby(df['ID'])
b = df[(a.transform('sum') - a.cumsum()).eq(0)]
and it's working fine so far. I am just wondering if there is any alternative approach to achieve this?
Reverse your rows (this is important). Then call groupby and cumsum, and take all rows with (reversed) cumsum value equal to zero.
df[df.colA.eq('B')[::-1].astype(int).groupby(df.ID).cumsum().eq(0)]
ID colA
1 1 D
3 2 D
4 2 C
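For anyone who wants to see the intermediate steps, here is a minimal, self-contained sketch of the same reversed-cumsum idea. The names is_b, rev_count, mask and result are made up for illustration, and the explicit reindex is my own addition so the boolean mask lines up with df in its original row order.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({"ID": [1, 1, 2, 2, 2],
                   "colA": ["B", "D", "B", "D", "C"]})

# True where the event is B
is_b = df["colA"].eq("B")

# Reverse the rows, then count B's per ID from the bottom up.
# The grouping key df["ID"] is aligned to the reversed Series by index label.
rev_count = is_b[::-1].astype(int).groupby(df["ID"]).cumsum()

# A count of zero means no B at or below this row within its group,
# so the row comes after the last B. Reindex to restore the original order.
mask = rev_count.eq(0).reindex(df.index)

result = df[mask]
print(result)
```

Rows that are themselves B get a count of at least 1, so they are excluded along with everything above the last B.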
You could do:
ix = (df.colA.eq('B')
.cumsum()
.groupby(df.ID)
.apply(lambda x: x.loc[x.idxmax()+1:]).index.get_level_values(1))
df.loc[ix,:]
ID colA
1 1 D
3 2 D
4 2 C
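The same approach unrolled into steps, as a rough sketch. It assumes df has the default integer index (0, 1, 2, ...), because the +1 on the idxmax label only makes sense for consecutive integer labels. The helper name rows_after_last_b and the variables running and tail are hypothetical.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({"ID": [1, 1, 2, 2, 2],
                   "colA": ["B", "D", "B", "D", "C"]})

# Running count of B's over the whole frame: 1, 1, 2, 2, 2
running = df["colA"].eq("B").cumsum()

def rows_after_last_b(x):
    # idxmax returns the first label where the running count peaks,
    # which is the last B in the group (assuming the group contains a B).
    # The +1 assumes consecutive integer labels.
    return x.loc[x.idxmax() + 1:]

# The result is indexed by (ID, original row label)
tail = running.groupby(df["ID"]).apply(rows_after_last_b)

# Level 1 holds the original row labels to keep
ix = tail.index.get_level_values(1)
result = df.loc[ix]
print(result)
```

One thing to watch: if a group contains no B at all, the running count is flat within it, idxmax picks the group's first row, and that first row gets dropped. Whether that is what you want depends on your data.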