
How can I multiply two dataframes with different column labels in pandas?

Tags: python, pandas

I'm trying to multiply (add/divide/etc.) two dataframes that have different column labels.

I'm sure this is possible, but what's the best way to do it? I've tried using rename to change the columns on one df first, but (1) I'd rather not do that and (2) my real data has a multiindex on the columns (where only one layer of the multiindex is differently labeled), and rename seems tricky for that case...

So to try and generalize my question, how can I get df1 * df2 using map to define the columns to multiply together?

df1 = pd.DataFrame([[1, 2, 3]] * 3, index=['1', '2', '3'], columns=['a', 'b', 'c'])
df2 = pd.DataFrame([[4, 5, 6]] * 3, index=['1', '2', '3'], columns=['d', 'e', 'f'])
map = {'a': 'e', 'b': 'd', 'c': 'f'}

df1 * df2 = ?
jmloser asked Sep 20 '12 22:09


3 Answers

I was also troubled by this problem. It seems that pandas needs both DataFrames to have the same column names for element-wise multiplication.

I searched a lot, and the only example I found, in the "setting with enlargement" section of the docs, adds a single column to a DataFrame.

For your question,

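# requires: import numpy as np (pd.np was removed in pandas 2.0); to_numpy() avoids label alignment
rs = np.multiply(ds2, ds1.to_numpy())

rs will have the same column names as ds2.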

Suppose we want to multiply several columns by several other columns in the same DataFrame and append the results to the original DataFrame.

For example, if ds1 and ds2 are both part of the same DataFrame ds, we can write:

ds[['r1', 'r2', 'r3']] = ds[['a', 'b', 'c']].to_numpy() * ds[['d', 'e', 'f']].to_numpy()
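
The same-DataFrame case described above can be sketched as follows. This is only an illustration under assumptions: the sample values are made up, and the intermediate names `products` and `result` are hypothetical. It builds the new columns explicitly and appends them with `pd.concat`, which avoids relying on assignment to several not-yet-existing columns at once.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the answer above.
ds = pd.DataFrame({
    'a': [1, 2], 'b': [3, 4], 'c': [5, 6],
    'd': [10, 20], 'e': [30, 40], 'f': [50, 60],
})

# Converting both sides to NumPy arrays multiplies by position,
# so the differing column labels are never aligned.
products = ds[['a', 'b', 'c']].to_numpy() * ds[['d', 'e', 'f']].to_numpy()

# Wrap the raw array with the new column names, then append it to ds.
result = pd.DataFrame(products, index=ds.index, columns=['r1', 'r2', 'r3'])
ds = pd.concat([ds, result], axis=1)
print(ds)
```

Here r1 is a*d, r2 is b*e and r3 is c*f, paired purely by position in the two column lists.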

I hope this helps.

BearPy answered Sep 28 '22 03:09


Updated solution now that pd.np is deprecated: df1.multiply(np.array(df2))

It keeps the column names of df1 and multiplies them by the columns of df2 in positional order.
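
A minimal sketch of the difference between label-aligned and positional multiplication, reusing the frames from the question (built here with full 3x3 data so the constructor accepts them). The names `aligned` and `positional` are hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([[1, 2, 3]] * 3, index=['1', '2', '3'], columns=['a', 'b', 'c'])
df2 = pd.DataFrame([[4, 5, 6]] * 3, index=['1', '2', '3'], columns=['d', 'e', 'f'])

# Plain df1 * df2 aligns on labels; with no shared column names
# the result has columns a..f and every cell is NaN.
aligned = df1 * df2

# Passing an array drops df2's labels, so 'a' pairs with 'd',
# 'b' with 'e' and 'c' with 'f', purely by position.
positional = df1.multiply(np.array(df2))
print(positional)
```

Note that the positional form ignores the row labels of df2 as well, so both frames need their rows in the same order.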

nnsk answered Sep 28 '22 03:09


I just stumbled onto the same problem. pandas aligns both the row and the column labels before doing element-wise multiplication, so you can simply rename with your mapping as part of the multiplication:

>>> df1 = pd.DataFrame([[1, 2, 3]] * 3, index=['1', '2', '3'], columns=['a', 'b', 'c'])
>>> df2 = pd.DataFrame([[4, 5, 6]] * 3, index=['1', '2', '3'], columns=['d', 'e', 'f'])
>>> df1
   a  b  c
1  1  2  3
2  1  2  3
3  1  2  3
>>> df2
   d  e  f
1  4  5  6
2  4  5  6
3  4  5  6
>>> mapping = {'a' : 'e', 'b' : 'd', 'c' : 'f'}
>>> df1.rename(columns=mapping) * df2
   d  e   f
1  8  5  18
2  8  5  18
3  8  5  18

If you want the 'natural' order of columns, you can create a mapping on the fly like:

>>> df1 * df2.rename(columns=dict(zip(df2.columns, df1.columns)))

For example, to compute the "Frobenius inner product" of the two matrices, you could do:

>>> (df1 * df2.rename(columns=dict(zip(df2.columns, df1.columns)))).sum().sum()
96
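
The question also mentions MultiIndex columns where only one level is labeled differently. A possible sketch of the same rename approach for that case, assuming a shared outer level `'x'` (a made-up label) and the question's mapping applied to the inner level via the `level` argument of `DataFrame.rename`:

```python
import pandas as pd

# Hypothetical two-level columns: the outer level 'x' is shared,
# only the inner level is labeled differently.
cols1 = pd.MultiIndex.from_product([['x'], ['a', 'b', 'c']])
cols2 = pd.MultiIndex.from_product([['x'], ['d', 'e', 'f']])
df1 = pd.DataFrame([[1, 2, 3]] * 3, index=['1', '2', '3'], columns=cols1)
df2 = pd.DataFrame([[4, 5, 6]] * 3, index=['1', '2', '3'], columns=cols2)

mapping = {'a': 'e', 'b': 'd', 'c': 'f'}

# level=1 applies the mapping to the inner column level only;
# the renamed frame then aligns with df2 on the full (outer, inner) label.
result = df1.rename(columns=mapping, level=1) * df2
print(result)
```

Because rename returns a new object, df1 itself keeps its original labels.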
patricksurry answered Sep 28 '22 04:09