Compare 2 consecutive rows and assign increasing value if different (using Pandas)

Question

I have a dataframe df_in like so:

import pandas as pd
dic_in = {'A':['aa','aa','bb','cc','cc','cc','cc','dd','dd','dd','ee'],
       'B':['200','200','200','400','400','500','700','700','900','900','200'],
       'C':['da','cs','fr','fs','se','at','yu','j5','31','ds','sz']}
df_in = pd.DataFrame(dic_in)

I would like to investigate the 2 columns A and B in the following way. I 2 consecutive rows[['A','B']] are equal then they are assigned a new value (according to a specific rule which i am about to describe). I will give an example to be more clear: If the first row[['A','B']] is equal to the following one, then I set 1; if the second one is equal to the third one then I will set 1. Every time two consecutive rows are different, then I increase the value to set by 1.

The result should look like this:

     A    B   C  value
0   aa  200  da      1
1   aa  200  cs      1
2   bb  200  fr      2
3   cc  400  fs      3
4   cc  400  se      3
5   cc  500  at      4
6   cc  700  yu      5
7   dd  700  j5      6
8   dd  900  31      7
9   dd  900  ds      7
10  ee  200  sz      8

Can you suggest me a smart one to achieve this goal?

root · Accepted Answer

Use shift and any to compare consecutive rows, using True to indicate where the value should change. Then take the cumulative sum with cumsum to get the increasing value:

df_in['value'] = (df_in[['A', 'B']] != df_in[['A', 'B']].shift()).any(axis=1)
df_in['value'] = df_in['value'].cumsum()

Alternatively, condensing it to one line:

df_in['value'] = (df_in[['A', 'B']] != df_in[['A', 'B']].shift()).any(axis=1).cumsum()

The resulting output:

     A    B   C  value
0   aa  200  da      1
1   aa  200  cs      1
2   bb  200  fr      2
3   cc  400  fs      3
4   cc  400  se      3
5   cc  500  at      4
6   cc  700  yu      5
7   dd  700  j5      6
8   dd  900  31      7
9   dd  900  ds      7
10  ee  200  sz      8

Compare 2 consecutive rows and assign increasing value if different (using Pandas)

Tags:

python

replace

pandas

dataframe

compare

Federico Gentile

1 Answers

root

Recent Activity

Donate For Us

Compare 2 consecutive rows and assign increasing value if different (using Pandas)

Tags:

python

replace

pandas

dataframe

compare

Federico Gentile

1 Answers

root

Related questions

Recent Activity

Donate For Us