Combine two pandas dataframes adding corresponding values

Tags:

python

pandas

data-analysis

I have two dataframes like these:

import pandas as pd

df1 = pd.DataFrame({'A': [1, 0, 3], 'B': [0, 0, 1], 'C': [0, 2, 2]}, index=['a', 'b', 'c'])
df2 = pd.DataFrame({'A': [0, 0], 'B': [2, 1]}, index=['a', 'c'])

df1 and df2:

   | A | B | C |          | A | B |    
---|---|---|---|       ---|---|---|
 a | 1 | 0 | 0 |        a | 0 | 2 |   
 b | 0 | 0 | 2 |        c | 0 | 1 |
 c | 3 | 1 | 2 |

And the expected result is:

   | A | B | C |
---|---|---|---|
 a | 1 | 2 | 0 |
 b | 0 | 0 | 2 |
 c | 3 | 2 | 2 |

I'm having trouble with this because either dataframe may be missing columns or rows that the other has (for example, df1 may not have all the columns and rows that df2 has).

asked Sep 30 '15 14:09

kiril

1 Answer

This follows the idea from the answer to the question "merge 2 dataframes in Pandas: join on some columns, sum up others".

Since in your case the index labels are what the two DataFrames have in common, you can stack them with pandas.concat(), group the result by index with DataFrame.groupby(level=0), and then take the sum of each group. Example:

In [27]: df1
Out[27]:
   A  B  C
a  1  0  0
b  0  0  2
c  3  1  2

In [28]: df2
Out[28]:
   A  B
a  0  2
c  0  1

In [29]: pd.concat([df1,df2]).groupby(level=0).sum()
Out[29]:
   A  B  C
a  1  2  0
b  0  0  2
c  3  2  2
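
The session above can also be written as a standalone script. The following is only a minimal sketch of the same concat-then-groupby approach: the helper name sum_frames and the extra frame df3 are hypothetical, added here to illustrate the case from the question where a frame has rows or columns the others lack.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 0, 3], 'B': [0, 0, 1], 'C': [0, 2, 2]}, index=['a', 'b', 'c'])
df2 = pd.DataFrame({'A': [0, 0], 'B': [2, 1]}, index=['a', 'c'])

# Hypothetical extra frame (not in the question): it has a row 'd' and a
# column 'D' that neither df1 nor df2 contains.
df3 = pd.DataFrame({'B': [5], 'D': [7]}, index=['d'])


def sum_frames(frames):
    """Hypothetical helper: stack the frames, then sum rows sharing an index label."""
    # concat aligns on column names; cells a frame does not have become NaN.
    stacked = pd.concat(frames)
    # Group by the index (level 0) and sum; NaN cells are skipped by sum().
    return stacked.groupby(level=0).sum()


result = sum_frames([df1, df2])
extended = sum_frames([df1, df2, df3])

print(result)
print(extended)
```

Because concat introduces NaN for the missing cells, the summed columns may come back as floats rather than integers. For exactly two frames, df1.add(df2, fill_value=0) is an alternative worth considering, though cells missing from both frames stay NaN there.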

answered Sep 22 '22 14:09

Anand S Kumar
