I have a dataframe, df1, that looks something like this:
Name Event Factor1
John A 2
John B 3
Ken A 1.5
....
and an additional dataframe, df2, like this:
Name Event Factor2
John A 1.2
John B .5
Ken A 2
I would like to join these two dataframes on the two columns Name and Event, so that the result has a single column holding Factor1 multiplied by Factor2:
Name Event FactorResult
John A 2.4
John B 1.5
Ken A 3
What would be the best way to do this? I am unsure how to join these on two columns. I know I can join and then multiply the two columns, but I'm wondering whether there is a better way than merging first, then multiplying and dropping the unneeded columns.
To merge two pandas DataFrames on multiple columns, use the pandas merge() method and pass the list of column names to its on parameter.
Overview: The mul() method of a DataFrame performs elementwise multiplication with another DataFrame, a pandas Series, or a Python sequence. When the other operand is a DataFrame or Series, the values are aligned on their labels before being multiplied.
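Because mul() aligns on labels, one way to avoid the merge, multiply, and drop sequence is to move Name and Event into the index of each frame and multiply the two factor columns directly. The following is only a sketch: it rebuilds df1 and df2 from the sample rows in the question and assumes each (Name, Event) pair occurs at most once per frame.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "Name": ["John", "John", "Ken"],
    "Event": ["A", "B", "A"],
    "Factor1": [2.0, 3.0, 1.5],
})
df2 = pd.DataFrame({
    "Name": ["John", "John", "Ken"],
    "Event": ["A", "B", "A"],
    "Factor2": [1.2, 0.5, 2.0],
})

# Index each frame by the join keys and keep only the factor column.
left = df1.set_index(["Name", "Event"])["Factor1"]
right = df2.set_index(["Name", "Event"])["Factor2"]

# Series.mul aligns the two Series on the shared (Name, Event) index.
# A key present in only one of them would produce NaN.
result = left.mul(right).rename("FactorResult").reset_index()
print(result)
```

Whether this is better than merging is a judgment call. It skips the drop step, but a key that appears in only one frame shows up as NaN rather than being left out, as it would be with the default inner merge.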
We can join columns from two DataFrames using the merge() function. This is similar to the SQL 'join' functionality. A detailed discussion of different join types is given in the SQL lesson. You specify the type of join you want using the how parameter, which defaults to 'inner'.
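To illustrate how the how parameter changes the result, here is a small sketch. The rows are made up for the illustration (the "Lisa" row is not in the question), chosen so that the two frames share only one key.

```python
import pandas as pd

# Hypothetical data: only (John, A) appears in both frames.
df1 = pd.DataFrame({"Name": ["John", "Ken"], "Event": ["A", "A"], "Factor1": [2.0, 1.5]})
df2 = pd.DataFrame({"Name": ["John", "Lisa"], "Event": ["A", "B"], "Factor2": [1.2, 3.0]})

keys = ["Name", "Event"]

# Keep only keys found in both frames (the default).
inner = df1.merge(df2, on=keys, how="inner")

# Keep every key from df1; missing Factor2 values become NaN.
left = df1.merge(df2, on=keys, how="left")

# Keep every key from either frame; gaps on both sides become NaN.
outer = df1.merge(df2, on=keys, how="outer")

print(len(inner), len(left), len(outer))
```

With the sample data in the question every key matches, so all join types return the same three rows. The choice only matters once the frames disagree on which (Name, Event) pairs they contain.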
You could merge and then multiply:
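# Join on both key columns (inner join by default)
merged = df1.merge(df2, on=['Name', 'Event'])
# Multiply the two factor columns row by row
merged['ResultFactor'] = merged['Factor1'] * merged['Factor2']
# Drop the input columns that are no longer needed
result = merged.drop(['Factor1', 'Factor2'], axis=1)
print(result)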
Output
Name Event ResultFactor
0 John A 2.4
1 John B 1.5
2 Ken A 3.0