Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Compare Multiple Columns to Get Rows that are Different in Two Pandas Dataframes

Tags:

python

pandas

I have two dataframes:

df1=
    A    B   C
0   A0   B0  C0
1   A1   B1  C1
2   A2   B2  C2

df2=
    A    B   C
0   A2   B2  C10
1   A1   B3  C11
2   A9   B4  C12

and I want to find rows in df1 that are not found in df2 based on one or two columns (or more columns). So, if I only compare column 'A' then the following rows from df1 are not found in df2 (note that column 'B' and column 'C' are not used for comparison between df1 and df2)

    A    B   C
0   A0   B0  C0

And I would like to return a series with

0   False
1   True
2   True

Or, if I only compare column 'A' and column 'B' then the following rows from df1 are not found in df2 (note that column 'C' is not used for comparison between df1 and df2)

    A    B   C
0   A0   B0  C0
1   A1   B1  C1

And I would want to return a series with

0   False
1   False
2   True

I know how to accomplish this using sets but I am looking for a straightforward Pandas way of accomplishing this.

like image 479
slaw Avatar asked Jan 26 '26 10:01

slaw


1 Answers

If your version is 0.17.0 then you can use pd.merge and pass the cols of interest, how='left' and set indicator=True to whether the values are only present in left or both. You can then test whether the appended _merge col is equal to 'both':

In [102]:
pd.merge(df1, df2, on='A',how='left', indicator=True)['_merge'] == 'both'

Out[102]:
0    False
1     True
2     True
Name: _merge, dtype: bool

In [103]:
pd.merge(df1, df2, on=['A', 'B'],how='left', indicator=True)['_merge'] == 'both'

Out[103]:
0    False
1    False
2     True
Name: _merge, dtype: bool

output from the merge:

In [104]:
pd.merge(df1, df2, on='A',how='left', indicator=True)

Out[104]:
    A B_x C_x  B_y  C_y     _merge
0  A0  B0  C0  NaN  NaN  left_only
1  A1  B1  C1   B3  C11       both
2  A2  B2  C2   B2  C10       both

In [105]:    
pd.merge(df1, df2, on=['A', 'B'],how='left', indicator=True)

Out[105]:
    A   B C_x  C_y     _merge
0  A0  B0  C0  NaN  left_only
1  A1  B1  C1  NaN  left_only
2  A2  B2  C2  C10       both
like image 160
EdChum Avatar answered Jan 27 '26 23:01

EdChum



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!