
Pandas - drop_duplicates with multiple conditions

I have a dataset where I want to remove duplicates based on some conditions.

For example, say I have this table:

ID  date    group
3001    2010    DCM
3001    2012    NII
3001    2012    DCM

Among rows that share the same ID, if two rows also have the same date, I want to keep only the row whose group is NII.

So the result would be:

ID  date    group
3001    2010    DCM
3001    2012    NII
Soyol asked Nov 30 '22 08:11



1 Answer

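Leverage duplicated with keep=False, which flags every row of a repeated (ID, date) pair, and keep the rows that are either unflagged or in group NII: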

df[~df.duplicated(['ID', 'date'], keep=False) | df['group'].eq('NII')]

     ID  date group
0  3001  2010   DCM
1  3001  2012   NII
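
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample table from the question and splits the mask into named steps. The variable names (is_dup, is_nii, result, alt) and the helper column _prio are made up for illustration and are not part of the original answer. The second half is an alternative I am assuming may be useful, not the answer's method: sort so that NII comes first within each (ID, date) pair, then call drop_duplicates.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [3001, 3001, 3001],
    "date": [2010, 2012, 2012],
    "group": ["DCM", "NII", "DCM"],
})

# True for every row whose (ID, date) pair occurs more than once
is_dup = df.duplicated(["ID", "date"], keep=False)
# True for rows in the preferred group
is_nii = df["group"].eq("NII")

# Keep rows that are not part of a duplicated pair, or that are NII
result = df[~is_dup | is_nii]
print(result)

# Alternative sketch (not from the answer): rank NII first, then keep
# the first row of each (ID, date) pair. "_prio" is a hypothetical
# helper column: False for NII, True otherwise, and False sorts first.
alt = (
    df.assign(_prio=df["group"].ne("NII"))
      .sort_values(["ID", "date", "_prio"])
      .drop_duplicates(["ID", "date"])
      .drop(columns="_prio")
)
print(alt)
```

On the sample data both approaches keep rows 0 and 1. They differ on other inputs: the mask keeps every NII row and drops all rows of a duplicated (ID, date) pair that contains no NII row, while the sort-based variant always keeps exactly one row per pair.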
cs95 answered Dec 25 '22 05:12
