Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Flag similarities between dataframes in new column

I would like to compare two pandas DataFrames of different length and identify matching index numbers. When values match I would like to flag those values in a new column.

df1:
Index    Column 1
41660    Apple
41935    Banana
42100    Strawberry
42599    Pineapple

df2:
Index    Column 1
42599    Pineapple

Output:
Index    Column 1    'Matching Index?'
41660    Apple
41935    Banana
42100    Strawberry
42599    Pineapple   True
like image 683
zbug Avatar asked Mar 09 '26 21:03

zbug


2 Answers

If these are really the index then you can use intersection on the indices:

In [61]:
df1.loc[df1.index.intersection(df2.index), 'flag'] = True
df1

Out[61]:
         Column 1  flag
Index                  
41660       Apple   NaN
41935      Banana   NaN
42100  Strawberry   NaN
42599   Pineapple  True

Otherwise use isin:

In [63]:
df1.loc[df1['Index'].isin(df2['Index']), 'flag'] = True
df1

Out[63]:
   Index    Column 1  flag
0  41660       Apple   NaN
1  41935      Banana   NaN
2  42100  Strawberry   NaN
3  42599   Pineapple  True
like image 91
EdChum Avatar answered Mar 11 '26 10:03

EdChum


+1 to @EdChum's answer. If you can live with a different value than True in your matching column try:

>>> df1.merge(df2,how='outer',indicator='Flag')
   Index      Column       Flag
0  41660       Apple  left_only
1  41935      Banana  left_only
2  42100  Strawberry  left_only
3  42599   Pineapple       both
like image 38
mechanical_meat Avatar answered Mar 11 '26 10:03

mechanical_meat



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!