Current df
ID col1 col2
1 A 1
1 A 2
1 B 3
1 B 4
1 C 5
1 D 6
I need to create col3 which will either be 1 or 2 depending on the following conditions: col3=1 unless for same ID AND same col1 there are multiple instances of col2. If there are multiple instances, start at the lowest instance of col2 and assign col3=1, then go to the next instance of col2, if next instance is previous instance+1 then col3=2, else col3=1
Sounds confusing but the desired outcome is
ID col1 col2 col3
1 A 1 1
1 A 2 2
1 B 3 1
1 B 4 2
1 C 5 1
1 D 6 1
EDIT
ID col1 col2 col3
1 A 1 1
1 A 2 2
1 A 3 2
1 B 3 1
1 B 4 2
1 B 17 1
1 C 5 1
1 D 6 1
Added ID=1, col1=A, col2=3 (since col2 is equal to 1+the row above it), col3=2 Added ID=1, col1=B, col2=17 (since col2 is NOT equal to 1+the row above it), col3=1
Let's use cumcount:
df.assign(col3 = df.groupby(['ID','col1']).cumcount() + 1)
Output:
ID col1 col2 col3
0 1 A 1 1
1 1 A 2 2
2 1 B 3 1
3 1 B 4 2
4 1 C 5 1
5 1 D 6 1
g = df.groupby(['ID','col1'])['col2'].apply(lambda x: x.diff().fillna(1).ne(1).cumsum())
df.assign(col3a = (df.groupby(['ID','col1',g]).cumcount() + 1).clip(1,2))
Output:
ID col1 col2 col3 col3a
0 1 A 1 1 1
1 1 A 2 2 2
2 1 A 3 2 2
3 1 B 3 1 1
4 1 B 4 2 2
5 1 B 17 1 1
6 1 C 5 1 1
7 1 D 6 1 1
Or maybe
df.groupby('col1').col2.apply(lambda x : (x.diff().eq(1))).map({True:2,False:1})
Out[1148]:
0 1
1 2
2 2
3 1
4 2
5 1
6 1
7 1
Name: col2, dtype: int64
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With