I have been trying to merge two DataFrames (df and df_details), much like an Excel "vlookup", but I am getting strange results. Below is the structure of the two DataFrames, with placeholder values instead of real data for simplicity.
df_details:
Abstract_Title  | Abstract_URL  | Session_No_v2 | Session_URL  | Session_ID
----------------------------------------------------------------------------
Abstract_Title1 | Abstract_URL1 | 1             | Session_URL1 | 12345
Abstract_Title2 | Abstract_URL2 | 1             | Session_URL1 | 12345
Abstract_Title3 | Abstract_URL3 | 1             | Session_URL1 | 12345
Abstract_Title4 | Abstract_URL4 | 2             | Session_URL2 | 22222
Abstract_Title5 | Abstract_URL5 | 2             | Session_URL2 | 22222
Abstract_Title6 | Abstract_URL6 | 3             | Session_URL3 | 98765
Abstract_Title7 | Abstract_URL7 | 3             | Session_URL3 | 98765
df:
Session_Title  | Session_URL  | Sponsors | Type         | Session_ID
---------------------------------------------------------------------
Session_Title1 | Session_URL1 | x, y z   | Paper        | 12345
Session_Title2 | Session_URL2 | x, y     | Presentation | 22222
Session_Title3 | Session_URL3 | a, b ,c  | Presentation | 98765
Session_Title4 | Session_URL4 | c        | Talk         | 12121
Session_Title5 | Session_URL5 | a, x     | Paper        | 33333
I want to merge on Session_ID, so that every row of df_details picks up the matching Session_Title, Sponsors and Type from df.

I tried the following merge, but the resulting DataFrame repeats certain rows several times. For example, df_details has 7,046 rows and df has 1,856 rows, yet final_df ends up with 21,148 rows:
final_df = pd.merge(df_details, df, how = 'outer', on = 'Session_ID')
Please help!
To generate your final output table I used the following code:
final_df = pd.merge(
    df_details,
    df[['Session_ID', 'Session_Title', 'Sponsors', 'Type']],
    on='Session_ID',
    how='outer',
)
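I can't confirm it without the real data, but a row count that grows well beyond both inputs usually means Session_ID is repeated on the lookup side (df), so each repeat is paired with every matching row in df_details. Here is a minimal sketch of how you might check for that and get a vlookup-style result. The toy data and the names naive, dup_keys and lookup are made up for illustration, and dropping duplicates assumes the repeated rows in df are true duplicates that are safe to discard.

```python
import pandas as pd

# Hypothetical toy data: Session_ID 12345 is repeated in df on purpose.
df_details = pd.DataFrame({
    "Abstract_Title": ["Abstract_Title1", "Abstract_Title2", "Abstract_Title4"],
    "Session_ID": [12345, 12345, 22222],
})
df = pd.DataFrame({
    "Session_Title": ["Session_Title1", "Session_Title1",
                      "Session_Title2", "Session_Title4"],
    "Type": ["Paper", "Paper", "Presentation", "Talk"],
    "Session_ID": [12345, 12345, 22222, 12121],
})

# A repeated key on the lookup side multiplies the matching rows,
# and how="outer" also keeps sessions that have no abstracts at all.
naive = pd.merge(df_details, df, how="outer", on="Session_ID")

# List the Session_ID values that occur more than once in df.
dup_keys = df.loc[df["Session_ID"].duplicated(keep=False), "Session_ID"].unique()

# Keep one row per Session_ID (assumes the repeats carry the same information).
lookup = df.drop_duplicates(subset="Session_ID")

# how="left" keeps exactly the rows of df_details, like a vlookup;
# validate raises MergeError if the right-hand keys are still not unique.
final_df = pd.merge(df_details, lookup, how="left", on="Session_ID",
                    validate="many_to_one")

print(len(naive), len(final_df), list(dup_keys))
```

If dup_keys comes back empty on your real data, then the duplication is coming from somewhere else (for example the same key repeated with different whitespace or dtypes), and it is worth comparing df_details["Session_ID"].dtype with df["Session_ID"].dtype.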