Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Select Rows with Same Id but different Values in Pandas

I know this is possible in database software but is there any way to do it in Python Pandas?

ID1         ID2      Value
1209345     1203     2
1209345     1204     3 <-----
1209345     1205     4
1209345     1203     2
1209345     1204     7 <-----
1209346     1203     1
1209347     1204     5

I have ID1 and, corresponding to that, I have multiple ID2s mapped with a value. I need to find all entries where ID1 and ID2 are matching but the values are different.

My current code counts the number of unique combinations of ID1 and ID2, but does not account for unique Value for each combination:

print(df.groupby(['ID1', 'ID2']).size())

ID1      ID2 
1209345  1203    2
         1204    2
         1205    1
1209346  1203    1
1209347  1204    1
dtype: int64

Note: This question is posted for @RohitGirdhar who deleted his original question. The solution I post is not necessarily the only or best one; other answers are encouraged.

like image 345
jpp Avatar asked Dec 18 '22 23:12

jpp


1 Answers

You can filter with nunique and transform:

df = df[df.groupby(['ID1', 'ID2'])['Value'].transform('nunique') > 1]

print (df)
       ID1   ID2  Value
1  1209345  1204      3
4  1209345  1204      7
like image 93
jezrael Avatar answered Jan 08 '23 07:01

jezrael