Beginner question. This seems like it should be a straightforward operation, but I can't figure it out from reading the docs.
I have a df with this structure:
|integer_id|int_field_1|int_field_2|
The integer_id column is non-unique, so I'd like to group the df by integer_id and sum the two fields.
The equivalent SQL is:
SELECT integer_id, SUM(int_field_1), SUM(int_field_2) FROM tbl
GROUP BY integer_id
Any suggestions on the simplest way to do this?
EDIT: Including input/output
Input:
integer_id int_field_1 int_field_2
2656 36 36
2656 36 36
9702 2 2
9702 1 1
Output using df.groupby('integer_id').sum():
integer_id int_field_1 int_field_2
2656 72 72
9702 3 3
Use DataFrame.groupby().sum() to group rows by one or more columns and compute the sum for each group. groupby() returns a DataFrameGroupBy object, which exposes an aggregate method sum() that calculates the sum of each column within every group.
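For instance, a minimal sketch that reproduces the input from the question (the name df is just what the question uses; the printed result is what pandas returns for this data):

import pandas as pd

# Rebuild the example input from the question
df = pd.DataFrame({
    'integer_id':  [2656, 2656, 9702, 9702],
    'int_field_1': [36, 36, 2, 1],
    'int_field_2': [36, 36, 2, 1],
})

# Group on the non-unique id and sum every remaining column per group
summed = df.groupby('integer_id').sum()
print(summed)
#             int_field_1  int_field_2
# integer_id
# 2656                 72           72
# 9702                  3            3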
Pandas comes with a whole host of SQL-like aggregation functions you can apply when grouping on one or more columns; this is Python's closest equivalent to dplyr's group_by + summarise logic.
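As a small sketch of that (reusing the same df as above; the aggregation choices and output column names are illustrative, not from the question):

# Several SQL-like aggregations at once, in the spirit of dplyr's summarise
stats = df.groupby('integer_id').agg(
    field_1_sum=('int_field_1', 'sum'),
    field_1_mean=('int_field_1', 'mean'),
    row_count=('int_field_1', 'count'),
)
print(stats)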
Separately, if you want to sum across selected columns of a DataFrame rather than within groups, DataFrame.sum() returns the sum of the values along the requested axis (you can select the columns first with loc[], iloc[], or a plain column list). To sum across columns row by row, pass axis=1.
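A quick sketch of that row-wise variant, in case it is what you are after (the column name row_total is purely illustrative):

# Sum selected columns across each row (axis=1), independent of any grouping
df['row_total'] = df[['int_field_1', 'int_field_2']].sum(axis=1)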
You just need to call sum on a groupby object:
df.groupby('integer_id').sum()
See the docs for further examples
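If you want integer_id back as a regular column (as in the SQL result) rather than as the index, either of these works:

df.groupby('integer_id', as_index=False).sum()
# or, equivalently
df.groupby('integer_id').sum().reset_index()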