How to groupby based on two columns in pandas?

Tags: python, pandas, dataframe, group-by, pandas-groupby

A similar question may have been asked before, but I couldn't find one that fits my problem exactly. I want to group a dataframe by two columns. For example, to turn this:

id product quantity
1  A       2
1  A       3
1  B       2
2  A       1
2  B       1
3  B       2
3  B       1

Into this:

id product quantity
1  A       5
1  B       2
2  A       1
2  B       1
3  B       3

That is, sum the "quantity" column over rows that have the same "id" and the same "product".

267

asked Apr 05 '17 04:04

ARASH

2 Answers

You need groupby with the parameter as_index=False, so that a DataFrame is returned, and then aggregate with sum:

df = df.groupby(['id', 'product'], as_index=False)['quantity'].sum()
print(df)
   id product  quantity
0   1       A         5
1   1       B         2
2   2       A         1
3   2       B         1
4   3       B         3

Or leave out as_index and call reset_index on the result:

df = df.groupby(['id', 'product'])['quantity'].sum().reset_index()
print(df)
   id product  quantity
0   1       A         5
1   1       B         2
2   2       A         1
3   2       B         1
4   3       B         3
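
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies both variants. The variable names flat, series and flat_again are only illustrative, and the integer and string column types are an assumption about how the original data was stored.

```python
import pandas as pd

# Sample data copied from the question (column types are assumed).
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 3, 3],
    "product": ["A", "A", "B", "A", "B", "B", "B"],
    "quantity": [2, 3, 2, 1, 1, 2, 1],
})

# Variant 1: as_index=False keeps the group keys as ordinary columns.
flat = df.groupby(["id", "product"], as_index=False)["quantity"].sum()

# Default behaviour: the keys become a MultiIndex and the result is a Series.
series = df.groupby(["id", "product"])["quantity"].sum()

# Variant 2: reset_index moves the MultiIndex levels back into columns.
flat_again = series.reset_index()

print(flat)
```

Both variants should give the same three-column frame; the only difference is whether the group keys pass through an index on the way.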

199

answered Oct 05 '22 12:10

jezrael

You can use pivot_table with aggfunc='sum':

df.pivot_table(values='quantity', index=['id', 'product'], aggfunc='sum').reset_index()

   id product  quantity
0   1       A         5
1   1       B         2
2   2       A         1
3   2       B         1
4   3       B         3
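
A self-contained sketch of the same idea, assuming the sample data from the question, may help when trying this out. The names pivoted and grouped are only illustrative; values and index are passed by keyword here for readability, and the groupby line is included just to compare the two results.

```python
import pandas as pd

# Sample data copied from the question (column types are assumed).
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 3, 3],
    "product": ["A", "A", "B", "A", "B", "B", "B"],
    "quantity": [2, 3, 2, 1, 1, 2, 1],
})

# Sum "quantity" for every (id, product) pair, then turn the
# resulting MultiIndex back into regular columns.
pivoted = df.pivot_table(
    values="quantity",
    index=["id", "product"],
    aggfunc="sum",
).reset_index()

# The groupby form, for comparison only.
grouped = df.groupby(["id", "product"], as_index=False)["quantity"].sum()

print(pivoted)
```

On this data the two frames hold the same rows, so the choice between pivot_table and groupby here is mostly a matter of taste.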

answered Oct 05 '22 14:10

piRSquared
