Sum dataframes to MultiIndex

I have two DataFrames with different indexes like:

import pandas as pd
a = pd.DataFrame([1, 2, 3], index=['a', 'b', 'c'], columns=['one'])
b = pd.DataFrame([5, 6], index=['d', 'e'], columns=['two'])

I can create a "Cartesian" MultiIndex from the two indexes using:

a_plus_b = pd.MultiIndex.from_product([a.index, b.index])

This gives a MultiIndex with no data attached to it:

MultiIndex(levels=[['a', 'b', 'c'], ['d', 'e']],
       labels=[[0, 0, 1, 1, 2, 2], [0, 1, 0, 1, 0, 1]])

How can I create a Cartesian sum over that MultiIndex, like the following?

'a' 'd' 6 # 1 + 5
    'e' 7 # 1 + 6
'b' 'd' 7 # 2 + 5
    'e' 8 # 2 + 6
'c' 'd' 8 # 3 + 5
    'e' 9 # 3 + 6
totikom asked May 18 '26 21:05


2 Answers

Use reindex with the level parameter to align each column to the first and second levels of the MultiIndex, then add the two results:

s = a['one'].reindex(a_plus_b, level=0) + b['two'].reindex(a_plus_b, level=1)
print(s)
a  d    6
   e    7
b  d    7
   e    8
c  d    8
   e    9
dtype: int64
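
For comparison, the same Cartesian sum can also be sketched with NumPy broadcasting. This is an alternative to the reindex approach above, not part of it; the names outer and s_outer are made up for illustration, and the sketch assumes the order of MultiIndex.from_product (first level varying slowest) lines up with the row-major order of the flattened array.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame([1, 2, 3], index=['a', 'b', 'c'], columns=['one'])
b = pd.DataFrame([5, 6], index=['d', 'e'], columns=['two'])

# Outer sum: outer[i, j] == a['one'][i] + b['two'][j], shape (3, 2)
outer = np.add.outer(a['one'].to_numpy(), b['two'].to_numpy())

# from_product yields (a, d), (a, e), (b, d), ... which matches
# the row-major order produced by ravel()
idx = pd.MultiIndex.from_product([a.index, b.index])
s_outer = pd.Series(outer.ravel(), index=idx)
print(s_outer)
```

This avoids index alignment entirely, so it only stays correct as long as the values and the MultiIndex are built from the same two indexes in the same order.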
jezrael answered May 21 '26 15:05


You can avoid the intermediary step of creating a MultiIndex explicitly by using pd.merge:

res = pd.merge(a.rename_axis('A').reset_index().assign(key=1),
               b.rename_axis('B').reset_index().assign(key=1), on='key')

res = res.assign(total=res['one'] + res['two'])\
         .groupby(['A', 'B'])['total'].sum()

print(res)

A  B
a  d    6
   e    7
b  d    7
   e    8
c  d    8
   e    9
Name: total, dtype: int64
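
On pandas 1.2 or later, the dummy key column is probably unnecessary, because pd.merge accepts how='cross'. A minimal sketch under that version assumption follows; the names cross and total are illustrative only, and set_index replaces the groupby step since each (A, B) pair occurs exactly once here.

```python
import pandas as pd

a = pd.DataFrame([1, 2, 3], index=['a', 'b', 'c'], columns=['one'])
b = pd.DataFrame([5, 6], index=['d', 'e'], columns=['two'])

# how='cross' (assumes pandas >= 1.2) pairs every row of the left
# frame with every row of the right frame, no shared key needed
cross = pd.merge(a.rename_axis('A').reset_index(),
                 b.rename_axis('B').reset_index(),
                 how='cross')

# Add the two value columns and move the original labels into a MultiIndex
total = (cross.assign(total=cross['one'] + cross['two'])
              .set_index(['A', 'B'])['total'])
print(total)
```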
jpp answered May 21 '26 14:05


