Grouping pandas series based on condition

I have a pandas DataFrame with a single column containing the following values.

      Data
0      A
1      A 
2      B
3      A
4      A 
5      A
6      B
7      A
8      A
9      B

I want to group these values so that the group number increases after each occurrence of the value B, as follows:

      Data  Group
0      A      1
1      A      1
2      B      1
3      A      2
4      A      2
5      A      2
6      B      2
7      A      3
8      A      3
9      B      3

How can this be achieved using pandas built-in functionality, for example by creating a helper column to facilitate the task?

Inderjeet Singh asked Jan 01 '23 04:01


2 Answers

You can compare the series with 'B', shift the result down one place so that each B stays in the group it closes, and then take the cumulative sum (adding 1 so the groups start at 1):

df['Data'].eq('B').shift(fill_value=False).cumsum().add(1)

0    1
1    1
2    1
3    2
4    2
5    2
6    2
7    3
8    3
9    3
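
For reference, here is a minimal self-contained sketch of the same approach applied to the data from the question. The DataFrame construction and the final groupby are only illustrations added here; the column name Group is taken from the desired output in the question.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"Data": list("AABAAABAAB")})

# True where the value is B; shifting down one row makes the *next* row
# start a new group, so each B stays in the group it closes.
starts_new_group = df["Data"].eq("B").shift(fill_value=False)

# The cumulative sum counts how many B rows precede each row; add 1 so
# the first group is numbered 1.
df["Group"] = starts_new_group.cumsum().add(1)

# Illustration only: the helper column can then drive a groupby.
joined = df.groupby("Group")["Data"].agg("".join)

print(df)
print(joined)
```

Because the shift fills the first row with False, data that does not end in B simply gets one more group for the trailing rows.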
anky answered Jan 13 '23 12:01


Note that the group labels produced here run in descending order. If you only need the labels to split Data into groups, the resulting grouping is the same:

s = df.Data.eq('B').iloc[::-1].cumsum()
s
9    1
8    1
7    1
6    2
5    2
4    2
3    2
2    3
1    3
0    3
Name: Data, dtype: int64
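
If ascending labels are needed after all, one possible follow-up (an assumption added here, not part of the answer) is to flip the reversed labels by subtracting them from their maximum. The names rev and Group are illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"Data": list("AABAAABAAB")})

# Reverse cumulative count of B rows: read from the top row down, the
# labels run 3, 3, 3, 2, 2, 2, 2, 1, 1, 1.
rev = df["Data"].eq("B").iloc[::-1].cumsum()

# Flip the labels so the first group is 1. Column assignment aligns on
# the index, so the reversed row order of `rev` does not matter.
df["Group"] = rev.max() - rev + 1

print(df)
```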
BENY answered Jan 13 '23 13:01