Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get the difference between dataframes according to specific columns pandas

I have multiple big dataframes and I need to get the difference between them, according to specific columns, while keeping the data of the other columns. for example, if I have 2 dataframes as below.

df1:

   num    Ref   Alt    DP     NL
0  300     C     A    50.0   30.0  
1  500     C     A    45.0   90.0 
2  255     A     C    69.0   26.0
3  450     B     B    55.0   25.0

df2:

   num    Ref   Alt    DP     NL
0  300     C     A    71.0   88.0  
1  500     A     A    45.0   54.0   
2  255     A     A    55.0   55.0
3  450     B     B    70.0   15.0 

I'd like to get the difference between those dataframes according to specific columns ['num','Ref','Alt'] while keeping the data form the remaining columns ['DP', 'NL']. In the example, rows 0 and 3 are the same across the 2 dataframes (since ['num','Ref','Alt'] are the same), so the output dataframes should look something like that:

df1:

   num    Ref   Alt     DP     NL
0  500     C     A     45.0   90.0   
2  255     A     C     69.0   26.0

df2:

   num    Ref   Alt    DP     NL
1  500     A     A    45.0   54.0   
2  255     A     A    55.0   55.0

or get it in the same dataframe, if possible...

I've tried to think of few solutions,one is concatenating them and then remove the duplicates, but beacuse they are bif dataframes, it might cause problems,or looping through the dataframes and use the below script:

temp_df = df_list[0]

for df in df_list[1:]:

    df_filtered_for_diff = pd.merge(temp_df, df, on=['num', 'Ref', 'Alt'], how='outer')                                                

    temp_df = df_filtered_for_diff

return (temp_df)

but I was wondering if there are better solutions?

like image 967
Bella Avatar asked Jun 02 '26 03:06

Bella


1 Answers

You can concatenate, then drop duplicates with keep=False:

df = pd.concat([df1, df2])\
       .drop_duplicates(subset=['num', 'Ref', 'Alt'], keep=False)

print(df)

   num Ref Alt    DP    NL
1  500   C   A  45.0  90.0
2  255   A   C  69.0  26.0
1  500   A   A  45.0  54.0
2  255   A   A  55.0  55.0
like image 129
jpp Avatar answered Jun 05 '26 02:06

jpp



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!