I have two pandas DataFrames that I want to multiply:
frame_score:
Score1 Score2
0 100 80
1 -150 20
2 -110 70
3 180 99
4 125 20
frame_weights:
Score1 Score2
0 0.6 0.4
I tried:
import pandas as pd
import numpy as np
frame_score = pd.DataFrame({'Score1' : [100, -150, -110, 180, 125],
'Score2' : [80, 20, 70, 99, 20]})
frame_weights = pd.DataFrame({'Score1': [0.6], 'Score2' : [0.4]})
print('frame_score: \n{0}'.format(frame_score))
print('\nframe_weights: \n{0}'.format(frame_weights))
# Each of the following alternatives yields the same results
frame_score_weighted = frame_score.mul(frame_weights, axis=0)
frame_score_weighted = frame_score * frame_weights
frame_score_weighted = frame_score.multiply(frame_weights, axis=1)
print('\nframe_score_weighted: \n{0}'.format(frame_score_weighted))
returns:
frame_score_weighted:
Score1 Score2
0 60.0 32.0
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
Rows 1 to 4 are NaN. How can I avoid that? For example, row 1 should be -90 and 8 (-90 = -150*0.6; 8 = 20*0.4).
NumPy, for example, would broadcast to match dimensions.
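To make the difference concrete, here is a minimal sketch that reuses the two frames from the question. It contrasts pandas' label alignment (which produces the NaN rows) with NumPy's shape-based broadcasting. The variable names `aligned` and `broadcast` are only illustrative.

```python
import numpy as np
import pandas as pd

frame_score = pd.DataFrame({'Score1': [100, -150, -110, 180, 125],
                            'Score2': [80, 20, 70, 99, 20]})
frame_weights = pd.DataFrame({'Score1': [0.6], 'Score2': [0.4]})

# pandas matches row labels first: only label 0 exists in both frames,
# so rows 1 to 4 have no partner and come out as NaN.
aligned = frame_score * frame_weights

# NumPy ignores labels: shape (5, 2) times shape (1, 2) broadcasts to (5, 2).
broadcast = frame_score.values * frame_weights.values

print(aligned.isna().sum().sum())  # number of NaN cells after alignment
print(broadcast.shape)
```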
The mul() method of a DataFrame multiplies it element-wise by another DataFrame, a pandas Series, or any other Python sequence. It does essentially the same thing as dataframe * other, but adds support for handling missing values in one of the inputs.
Two pandas Series objects can likewise be multiplied with the * operator. With mul(), missing values (None or NaN) in the data can be replaced by a default value through the fill_value parameter.
Series.multiply() performs the same element-wise multiplication of a Series and other: it is equivalent to series * other, but with support for substituting a fill_value for missing data in one of the inputs.
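As a small illustration of fill_value, the sketch below multiplies two made-up Series (`a` and `b` are hypothetical, not from the question) that each have one missing entry. Note that fill_value only fills gaps in the data; it does not by itself fix the row-alignment problem from the question.

```python
import pandas as pd

# Hypothetical data: each Series is missing one value.
a = pd.Series([1.0, None, 3.0])
b = pd.Series([2.0, 2.0, None])

# The plain operator propagates NaN wherever either side is missing.
plain = a * b

# mul() with fill_value=1 treats a missing value on one side as 1.
filled = a.mul(b, fill_value=1)

print(plain)
print(filled)
```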
Edit: for arbitrary dimensions, try using values to manipulate the values of the DataFrames in an array-like fashion:
# element-wise multiplication
frame_score_weighted = frame_score.values*frame_weights.values
# change to pandas dataframe and rename columns
frame_score_weighted = pd.DataFrame(data=frame_score_weighted, columns=['Score1','Score2'])
#Out:
Score1 Score2
0 60.0 32.0
1 -90.0 8.0
2 -66.0 28.0
3 108.0 39.6
4 75.0 8.0
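One caveat with the values approach is that the labels are dropped along the way. A possible variation, sketched here under the assumption that the result should keep the labels of frame_score, passes its index and columns back in instead of hard-coding the column names. The letter index is hypothetical and only there to show that the labels survive.

```python
import numpy as np
import pandas as pd

# Same scores as in the question, but with a hypothetical non-default index.
frame_score = pd.DataFrame({'Score1': [100, -150, -110, 180, 125],
                            'Score2': [80, 20, 70, 99, 20]},
                           index=list('abcde'))
frame_weights = pd.DataFrame({'Score1': [0.6], 'Score2': [0.4]})

# Broadcast on the raw arrays, then reattach the original labels.
product = frame_score.values * frame_weights.values
frame_score_weighted = pd.DataFrame(product,
                                    index=frame_score.index,
                                    columns=frame_score.columns)
print(frame_score_weighted)
```

This relies on both frames having their columns in the same order, since NumPy multiplies by position rather than by label.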
Just use some additional indexing to make sure you extract the desired weights as a scalar when you do the multiplication.
frame_score['Score1'] = frame_score['Score1']*frame_weights['Score1'][0]
frame_score['Score2'] = frame_score['Score2']*frame_weights['Score2'][0]
frame_score
#Out:
Score1 Score2
0 60.0 32.0
1 -90.0 8.0
2 -66.0 28.0
3 108.0 39.6
4 75.0 8.0
By default, when a pd.DataFrame is multiplied by a pd.Series, pandas aligns the index of the pd.Series with the columns of the pd.DataFrame. So, we get the relevant pd.Series from frame_weights by accessing just the first row.
frame_score * frame_weights.loc[0]
Score1 Score2
0 60.0 32.0
1 -90.0 8.0
2 -66.0 28.0
3 108.0 39.6
4 75.0 8.0
You can edit frame_score in place with
frame_score *= frame_weights.loc[0]
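Because the matching is done on labels, the order of the entries in the weights Series should not matter. The sketch below uses a hypothetical `weights` Series, deliberately written in reverse column order, and also spells the operation out with mul(..., axis='columns'), which is the explicit form of the same alignment.

```python
import numpy as np
import pandas as pd

frame_score = pd.DataFrame({'Score1': [100, -150, -110, 180, 125],
                            'Score2': [80, 20, 70, 99, 20]})

# Hypothetical weights, listed in the opposite order to the columns.
weights = pd.Series({'Score2': 0.4, 'Score1': 0.6})

# Operator form: Series index labels are matched to DataFrame column labels.
by_operator = frame_score * weights

# Method form with the axis stated explicitly.
by_method = frame_score.mul(weights, axis='columns')

print(by_operator[['Score1', 'Score2']])
```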