I have two dataframes: df1 and df2.
df1
Index date | X1 | X2
0 01-01-2020 | H | 10
1 01-02-2020 | H | 30
2 01-03-2020 | Y | 15
3 01-04-2020 | Y | 20
df2
Index | X1 | X2
0 | H | 5
1 | Y | 10
I want to multiply df1 and df2 when the value on column X1 matches.
Desired result:
Index date | X1 | X2
0 01-01-2020 | H | 50
1 01-02-2020 | H | 150
2 01-03-2020 | Y | 150
3 01-04-2020 | Y | 200
The mul() method of DataFrame object multiplies the elements of a DataFrame object with another DataFrame object, series or any other Python sequence. mul() does an elementwise multiplication of a DataFrame with another DataFrame, a pandas Series or a Python Sequence.
We can use either pandas. merge() or DataFrame. merge() to merge multiple Dataframes. Merging multiple Dataframes is similar to SQL join and supports different types of join inner , left , right , outer , cross .
It is possible to join the different columns is using concat() method. DataFrame: It is dataframe name. axis: 0 refers to the row axis and1 refers the column axis. join: Type of join.
The mul() method multiplies each value in the DataFrame with a specified value. The specified value must be an object that can be multiplied with the values of the DataFrame.
Applying an IF condition under an existing DataFrame column. 1 If the number is equal to 0, then change the value to 999. 2 If the number is equal to 5, then change the value to 555 import pandas as pd numbers = {'set_of_numbers': [1,2,3,4,5,6,7,8,9,10,0,0]} df = pd.
The reason is dataframe may be having multiple columns and multiple rows. Selective display of columns with limited rows is always the expected view of users. To fulfill the user’s expectations and also help in machine deep learning scenarios, filtering of Pandas dataframe with multiple conditions is much necessary.
Alternatively, you may store the results under an existing DataFrame column. For example, let’s say that you created a DataFrame that has 12 numbers, where the last two numbers are zeros: You may then apply the following IF conditions, and then store the results under the existing ‘set_of_numbers’ column:
You can try to print the data frame and it will show you two columns as: print(data_frame) print (data_frame) print (data_frame) Output: col1 col2 0 10 40 1 20 50 2 30 60. Now you can just use the “*” operator between column one and column two of the data frame as: data_frame["col1*col2"] = data_frame["col1"] * data_frame["col2"]
Use Series.map
for match by X1
and then multiple by X2
:
df1['X2'] *= df1['X1'].map(df2.set_index('X1')['X2'])
print (df1)
date X1 X2
0 01-01-2020 H 50
1 01-02-2020 H 150
2 01-03-2020 Y 150
3 01-04-2020 Y 200
Or use DataFrame.merge
with left join:
df1['X2'] *= df2.merge(df1, on='X1', how='left')['X2_x']
print (df1)
date X1 X2
0 01-01-2020 H 50
1 01-02-2020 H 150
2 01-03-2020 Y 150
3 01-04-2020 Y 200
You can set the index on both dataframes and assign the array to df
:
df["X2"] = df.set_index("X1").X2.mul(df1.set_index("X1").X2).array
df
date X1 X2
0 01-01-2020 H 50
1 01-02-2020 H 150
2 01-03-2020 Y 150
3 01-04-2020 Y 200
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With