
Combine two pandas dataframes adding corresponding values

I have two dataframes like these:

df1 = pd.DataFrame({'A': [1,0,3], 'B':[0,0,1], 'C':[0,2,2]}, index =['a','b','c'])
df2 = pd.DataFrame({'A': [0,0], 'B':[2,1]}, index =['a','c'])

df1 and df2:

   | A | B | C |          | A | B |    
---|---|---|---|       ---|---|---|
 a | 1 | 0 | 0 |        a | 0 | 2 |   
 b | 0 | 0 | 2 |        c | 0 | 1 |
 c | 3 | 1 | 2 |

And the expected result is:

   | A | B | C |
---|---|---|---|
 a | 1 | 2 | 0 |
 b | 0 | 0 | 2 |
 c | 3 | 2 | 2 |

I'm having trouble with this because either DataFrame may be missing columns or rows that the other has (for example, df1 may not have every column and row that df2 has).

Asked by kiril on Sep 30 '15 at 14:09




1 Answer

This follows the idea in the answer to the question "merge 2 dataframes in Pandas: join on some columns, sum up others".

In your case the index labels are what the two DataFrames have in common, so you can stack them with pandas.concat(), group the result by index with DataFrame.groupby(level=0), and then take the sum of each group. Rows or columns that exist in only one DataFrame are padded with NaN by concat(), and sum() skips NaN by default. Example:

In [27]: df1
Out[27]:
   A  B  C
a  1  0  0
b  0  0  2
c  3  1  2

In [28]: df2
Out[28]:
   A  B
a  0  2
c  0  1

In [29]: pd.concat([df1,df2]).groupby(level=0).sum()
Out[29]:
   A  B  C
a  1  2  0
b  0  0  2
c  3  2  2
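
The session above can be reproduced as a standalone script. This is only a sketch: the variable names stacked, summed and added are illustrative and not part of the original answer, and the df1.add(df2, fill_value=0) line is an alternative that the answer does not mention. Because concat() pads the missing cells with NaN, column C may come back as floats on recent pandas versions, so the sketch casts back to integers.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 0, 3], 'B': [0, 0, 1], 'C': [0, 2, 2]},
                   index=['a', 'b', 'c'])
df2 = pd.DataFrame({'A': [0, 0], 'B': [2, 1]}, index=['a', 'c'])

# Stack the two frames; cells that exist in only one frame become NaN.
stacked = pd.concat([df1, df2])

# Group rows that share an index label and sum each group.
# sum() skips NaN by default, so padded cells contribute nothing.
summed = stacked.groupby(level=0).sum()

# NaN padding can turn integer columns into floats; cast back if needed.
summed = summed.astype('int64')

# Alternative (not from the original answer): label-aligned addition.
# fill_value=0 covers cells missing from one frame; fillna(0) covers
# cells that would be missing from both frames.
added = df1.add(df2, fill_value=0).fillna(0).astype('int64')

print(summed)
print(added)
```

Both approaches align on row and column labels, so neither DataFrame needs to contain every label that the other has.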

Answered by Anand S Kumar on Sep 22 '22 at 14:09