
Increment a column value when one column changes and reset it when another column changes

I have a DataFrame in which I want a counter that increases each time val changes and resets to 1 whenever id changes.

import pandas as pd

data = [['p1',1],
        ['p1',1],
        ['p1',2],
        ['p2',3],
        ['p2',5],
        ['p3',1],
        ['p3',2],
        ['p3',1]]

df = pd.DataFrame(data = data,columns = ['id','val'])

Desired output

       id val  count
    0  p1   1      1
    1  p1   1      1
    2  p1   2      2
    3  p2   3      1
    4  p2   5      2
    5  p3   1      1
    6  p3   2      2
    7  p3   1      3

I have got this far:

df['count'] = (df.val.diff() != 0).cumsum()

This increments whenever the val column changes, but it doesn't reset when the id column changes.
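
For illustration, here is a small self-contained sketch of that attempt run on the sample data (the name attempt is used only for this example). The counter keeps climbing across id boundaries instead of restarting at 1:

```python
import pandas as pd

data = [['p1', 1], ['p1', 1], ['p1', 2],
        ['p2', 3], ['p2', 5],
        ['p3', 1], ['p3', 2], ['p3', 1]]
df = pd.DataFrame(data=data, columns=['id', 'val'])

# diff() is NaN on the first row, and NaN != 0 is True, so the count starts at 1.
# Nothing here looks at id, so the count never restarts.
attempt = (df.val.diff() != 0).cumsum()

print(attempt.tolist())  # [1, 1, 2, 3, 4, 5, 6, 7]
```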

Ajay Chinni asked Dec 03 '22 17:12



1 Answer

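You can try groupby + transform with a lambda: within each id group, flag every row whose val differs from the previous row, then take the cumulative sum of those flags.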

df['count'] = df.groupby("id")['val'].transform(lambda x: x.ne(x.shift()).cumsum())

print(df)

   id  val  count
0  p1    1      1
1  p1    1      1
2  p1    2      2
3  p2    3      1
4  p2    5      2
5  p3    1      1
6  p3    2      2
7  p3    1      3
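
If it helps to see why this works, the lambda can be split into separate steps. This is only a sketch under the same sample data; the intermediate names prev_val and changed are illustrative and not part of the answer above. Note that the counter tracks changes rather than distinct values, which is why the last p3 row (val back to 1) gets 3.

```python
import pandas as pd

data = [['p1', 1], ['p1', 1], ['p1', 2],
        ['p2', 3], ['p2', 5],
        ['p3', 1], ['p3', 2], ['p3', 1]]
df = pd.DataFrame(data=data, columns=['id', 'val'])

# Previous val within the same id; the first row of each id has no predecessor (NaN).
prev_val = df.groupby('id')['val'].shift()

# True wherever val differs from the previous row of the same id.
# NaN never compares equal, so the first row of every id is flagged True.
changed = df['val'].ne(prev_val)

# Running total of the flags, restarted for each id.
# The flags are cast to int so the result is an integer count.
df['count'] = changed.astype(int).groupby(df['id']).cumsum()

print(df['count'].tolist())  # [1, 1, 2, 1, 2, 1, 2, 3]
```

This should produce the same count column as the transform version, and it avoids calling a Python lambda once per group, which may matter on frames with many distinct ids.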
anky answered Jan 19 '23 00:01
