I might be approaching this is the wrong way. If I have two dataframes a and b:
Dataframe A:
a b c
1 2 4
1 6 5
1 8 7
and Dataframe B:
a b c d
1 2 4 9
1 6 5 7
1 8 7 10
And I want to join them, but cruically sum the columns where the column name matchs, and keep the column names that don't match. So the final joined table would look like:
a b c d
2 4 8 9
2 12 10 7
2 16 14 10
Note: The indexs would always match (i.e the same number of records in the same order)
You could call add
on the larger df passing in the other df and call fillna
:
In [18]:
df1.add(df).fillna(df1)
Out[18]:
a b c d
0 2 4 8 9
1 2 12 10 7
2 2 16 14 10
Another way is to use combine_first
:
In [20]:
df1.add(df).combine_first(df1)
Out[20]:
a b c d
0 2 4 8 9
1 2 12 10 7
2 2 16 14 10
You can align the dataframe, fill NAN with zero and perform simple addition.
d = pd.DataFrame(data, columns=['a','b','c'])
d2 = pd.DataFrame(data2, columns=['a','b','c','d'])
d, d2 = d.align(d2, fill_value=0)
sum_df = d + d2
In [23]: d
Out[23]:
a b c d
0 1 2 4 0
1 1 6 5 0
2 1 8 7 0
In [24]: d2
Out[24]:
a b c d
0 1 2 4 9
1 1 6 5 7
2 1 8 7 10
In [25]:sum_df
Out[25]:
a b c d
0 2 4 8 9
1 2 12 10 7
2 2 16 14 10
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With