Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas DataFrame merge summing column

I'm trying to merge two DataFrames summing columns value.

>>> print(df1)    id name  weight 0   1    A       0 1   2    B      10 2   3    C      10  >>> print(df2)    id name  weight 0   2    B      15 1   3    C      10 

I need to sum weight values during merging for similar values in the common column.

merge = pd.merge(df1, df2, how='inner') 

So the output will be something like following.

   id name  weight 1   2    B      25 2   3    C      20 
like image 579
Nilani Algiriyage Avatar asked Apr 29 '14 09:04

Nilani Algiriyage


People also ask

How do I sum columns together in pandas?

sum() to Sum All Columns. Use DataFrame. sum() to get sum/total of a DataFrame for both rows and columns, to get the total sum of columns use axis=1 param. By default, this method takes axis=0 which means summing of rows.


2 Answers

This solution works also if you want to sum more than one column. Assume data frames

>>> df1    id name  weight  height 0   1    A       0       5 1   2    B      10      10 2   3    C      10      15 >>> df2    id name  weight  height 0   2    B      25      20 1   3    C      20      30 

You can concatenate them and group by index columns.

>>> pd.concat([df1, df2]).groupby(['id', 'name']).sum().reset_index()    id name  weight  height 0   1    A       0       5 1   2    B      35      30 2   3    C      30      45 
like image 144
Jan Kislinger Avatar answered Sep 19 '22 03:09

Jan Kislinger


In [41]: pd.merge(df1, df2, on=['id', 'name']).set_index(['id', 'name']).sum(axis=1) Out[41]:  id  name 2   B       25 3   C       20 dtype: int64 
like image 43
waitingkuo Avatar answered Sep 22 '22 03:09

waitingkuo