Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to merge DataFrames only if one column meets a certain condition

Say I have a dataframe 'df_main':

ID   Type
1    Car
1    Truck
2    Truck

And another dataframe 'truck_data':

Truck_ID         Data
1          TruckData1
2          TruckData2

Merging these two gives the dataframe:

   ID   Type  Truck_ID        Data
0   1    Car         1  TruckData1
1   1  Truck         1  TruckData1
2   2  Truck         2  TruckData2

How can I modify the merge statement to only merge in the rows from truck_data that match on ID if and only if the Type == 'Truck'?

Desired output:

   ID   Type  Truck_ID        Data
0   1    Car         1  
1   1  Truck         1  TruckData1
2   2  Truck         2  TruckData2

Current code:

df_main = pd.DataFrame(columns=['ID', 'Type'], data=[[1, 'Car'], [1, 'Truck'], [2, 'Truck']])
truck_df = pd.DataFrame(columns=['Truck_ID', 'Data'], data=[[1, 'TruckData1'], [2, 'TruckData2']])
df_main = df_main.merge(truck_df, left_on='ID', right_on='Truck_ID', how='left')
like image 408
Barry Avatar asked Mar 01 '23 14:03

Barry


1 Answers

Filter the required rows from df1 / perform merge /concat the unfiltered rows.

m = df_main.Type.eq('Truck')
merged_df = pd.concat([df_main.loc[~m], df_main.loc[m].merge(truck_df, left_on='ID', right_on='Truck_ID')]

OUTPUT:

   ID   Type        Data
0   1    Car         NaN
0   1  Truck  TruckData1
1   2  Truck  TruckData2
like image 193
Nk03 Avatar answered May 01 '23 07:05

Nk03