I am currently merging two dataframes with an inner join. However, after merging, I see that the rows are duplicated even when the column I merged on contains the same values.
Specifically, I have the following code.
merged_df = pd.merge(df1, df2, on=['email_address'], how='inner')
Here are the two dataframes and the results.
df1
   email_address      name   surname
0  [email protected]  john   smith
1  [email protected]  john   smith
2  [email protected]  elvis  presley
df2
   email_address      street   city
0  [email protected]  street1  NY
1  [email protected]  street1  NY
2  [email protected]  street2  LA
merged_df
   email_address      name   surname  street   city
0  [email protected]  john   smith    street1  NY
1  [email protected]  john   smith    street1  NY
2  [email protected]  john   smith    street1  NY
3  [email protected]  john   smith    street1  NY
4  [email protected]  elvis  presley  street2  LA
5  [email protected]  elvis  presley  street2  LA
My question is, shouldn't it look like this? This is how I would like merged_df to look.
   email_address      name   surname  street   city
0  [email protected]  john   smith    street1  NY
1  [email protected]  john   smith    street1  NY
2  [email protected]  elvis  presley  street2  LA
Are there any ways I can achieve this?
To stack DataFrames on top of each other, use the concat() function; to discard duplicate rows from the result, call the drop_duplicates() method on it.
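As a minimal sketch of that concat-then-deduplicate pattern: the frames `a` and `b` and the example.com addresses below are hypothetical placeholders, not data from the question. Note that concat() stacks rows rather than joining on a key, so it answers a different need than merge().

```python
import pandas as pd

# Hypothetical frames with one row in common (placeholder addresses).
a = pd.DataFrame({
    "email_address": ["john@example.com", "elvis@example.com"],
    "city": ["NY", "LA"],
})
b = pd.DataFrame({
    "email_address": ["john@example.com", "paul@example.com"],
    "city": ["NY", "SF"],
})

# Stack the rows of both frames; ignore_index renumbers the result 0..n-1.
stacked = pd.concat([a, b], ignore_index=True)

# Drop rows that are identical in every column, keeping the first occurrence.
unique_rows = stacked.drop_duplicates().reset_index(drop=True)

print(unique_rows)
```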
Use DataFrame.drop_duplicates() to drop duplicate rows and keep the first occurrence of each. Called without any arguments, it drops rows that have the same values in all columns.
Use the merge() function to join the two data frames with an inner join. Add a suffix such as 'remove' to the joined columns that have the same name in both data frames, then use the drop() function to remove the columns carrying that suffix. This ensures that identical columns do not appear twice in the new dataframe.
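The suffix-and-drop approach could look roughly like the sketch below. The frames `left` and `right`, the shared `city` column, and the `_remove` suffix are illustrative assumptions. This removes duplicated columns only; it does not remove duplicated rows.

```python
import pandas as pd

# Hypothetical frames that share a non-key column, "city".
left = pd.DataFrame({
    "email_address": ["john@example.com", "elvis@example.com"],
    "name": ["john", "elvis"],
    "city": ["NY", "LA"],
})
right = pd.DataFrame({
    "email_address": ["john@example.com", "elvis@example.com"],
    "street": ["street1", "street2"],
    "city": ["NY", "LA"],
})

# Keep the left-hand name unchanged and tag the right-hand copy with "_remove".
merged = pd.merge(left, right, on="email_address", how="inner",
                  suffixes=("", "_remove"))

# Drop every column that carries the marker suffix.
to_drop = [col for col in merged.columns if col.endswith("_remove")]
merged = merged.drop(columns=to_drop)

print(merged.columns.tolist())
```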
list_2_nodups = list_2.drop_duplicates()
pd.merge(list_1, list_2_nodups, on=['email_address'])
The duplicate rows are expected. Each john smith in list_1 matches each john smith in list_2, so two rows on one side and two on the other produce four merged rows. I had to drop the duplicates in one of the lists. I chose list_2.
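A small end-to-end sketch of this fix follows. The real addresses are redacted in the question, so the example.com addresses are placeholders, and it is an assumption that both john smith rows share the same address. The `validate="many_to_one"` argument is an optional addition that makes pandas raise an error if the right-hand keys are still not unique.

```python
import pandas as pd

# Placeholder data modelled on the question; the real addresses are redacted.
df1 = pd.DataFrame({
    "email_address": ["john@example.com", "john@example.com", "elvis@example.com"],
    "name": ["john", "john", "elvis"],
    "surname": ["smith", "smith", "presley"],
})
df2 = pd.DataFrame({
    "email_address": ["john@example.com", "john@example.com", "elvis@example.com"],
    "street": ["street1", "street1", "street2"],
    "city": ["NY", "NY", "LA"],
})

# Plain merge: every matching left row pairs with every matching right row.
naive = pd.merge(df1, df2, on=["email_address"], how="inner")

# Remove fully identical rows from one side before merging.
df2_nodups = df2.drop_duplicates()

# validate checks that each key appears at most once on the right side.
merged_df = pd.merge(df1, df2_nodups, on=["email_address"], how="inner",
                     validate="many_to_one")

print(len(naive), len(merged_df))
```

drop_duplicates() without arguments only removes rows that match in every column. If the same address could appear with different streets, passing subset=['email_address'] would keep one row per address instead, at the cost of discarding the other streets.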