Python/Pandas Dataframe logic on a column with restrictions

Question

Current df

ID col1 col2
1  A  1
1  A  2
1  B  3
1  B  4
1  C  5
1  D  6

I need to create col3 which will either be 1 or 2 depending on the following conditions: col3=1 unless for same ID AND same col1 there are multiple instances of col2. If there are multiple instances, start at the lowest instance of col2 and assign col3=1, then go to the next instance of col2, if next instance is previous instance+1 then col3=2, else col3=1

Sounds confusing but the desired outcome is

ID col1 col2 col3
1  A    1    1
1  A    2    2
1  B    3    1
1  B    4    2
1  C    5    1
1  D    6    1

EDIT

ID col1 col2 col3
1  A    1    1
1  A    2    2
1  A    3    2
1  B    3    1
1  B    4    2
1  B    17   1
1  C    5    1
1  D    6    1

Added ID=1, col1=A, col2=3 (since col2 is equal to 1+the row above it), col3=2 Added ID=1, col1=B, col2=17 (since col2 is NOT equal to 1+the row above it), col3=1

Scott Boston · Accepted Answer

Let's use cumcount:

df.assign(col3 = df.groupby(['ID','col1']).cumcount() + 1)

Output:

   ID col1  col2  col3
0   1    A     1     1
1   1    A     2     2
2   1    B     3     1
3   1    B     4     2
4   1    C     5     1
5   1    D     6     1

EDIT to handle "clipping" use df.clip for comparison create new column 'col3a':

g = df.groupby(['ID','col1'])['col2'].apply(lambda x: x.diff().fillna(1).ne(1).cumsum())
df.assign(col3a = (df.groupby(['ID','col1',g]).cumcount() + 1).clip(1,2))

Output:

   ID col1  col2  col3  col3a
0   1    A     1     1      1
1   1    A     2     2      2
2   1    A     3     2      2
3   1    B     3     1      1
4   1    B     4     2      2
5   1    B    17     1      1
6   1    C     5     1      1
7   1    D     6     1      1

BENY · Answer

Or maybe

df.groupby('col1').col2.apply(lambda x : (x.diff().eq(1))).map({True:2,False:1})
Out[1148]: 
0    1
1    2
2    2
3    1
4    2
5    1
6    1
7    1
Name: col2, dtype: int64

Python/Pandas Dataframe logic on a column with restrictions

Tags:

python

pandas

numpy

babz

2 Answers

EDIT to handle "clipping" use df.clip for comparison create new column 'col3a':

Scott Boston

BENY

Recent Activity

Donate For Us

Python/Pandas Dataframe logic on a column with restrictions

Tags:

python

pandas

numpy

babz

2 Answers

EDIT to handle "clipping" use df.clip for comparison create new column 'col3a':

Scott Boston

BENY

Related questions

Recent Activity

Donate For Us