Adding a 'count' column to the result of a groupby in pandas?

Tags:

python

pandas

I think this is a fairly basic question, but I can't seem to find the solution.

I have a pandas dataframe similar to the following:

import pandas as pd

df = pd.DataFrame({'A' : ['x','x','y','z','z'],
                   'B' : ['p','p','q','r','r']})
df

which creates a table like this:

    A   B
0   x   p
1   x   p
2   y   q
3   z   r
4   z   r

I'm trying to create a table that shows how many times each distinct (A, B) combination occurs in that dataframe. My goal is something like this:

    A   B   c
0   x   p   2
1   y   q   1
2   z   r   2

I can't find the correct functions to achieve this, though. I've tried:

df.groupby(['A','B']).agg('count')

This produces a table with 3 rows (as expected) but without a 'count' column. I don't know how to add in that count column. Could someone point me in the right direction?

874

asked Feb 13 '18 15:02

Oliver

2 Answers

You can use size:

df.groupby(['A','B']).size()
Out[590]:
A  B
x  p    2
y  q    1
z  r    2
dtype: int64

To make your own attempt work, select one of the columns before aggregating:

df.groupby(['A','B']).B.agg('count')
Out[591]:
A  B
x  p    2
y  q    1
z  r    2
Name: B, dtype: int64

Update: to get a regular DataFrame with the count in a column named c:

df.groupby(['A','B']).B.agg('count').to_frame('c').reset_index()

#df.groupby(['A','B']).size().to_frame('c').reset_index()
Out[593]:
   A  B  c
0  x  p  2
1  y  q  1
2  z  r  2
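As a side note, the original agg('count') call most likely returns no columns because both A and B are used as group keys, so nothing is left to count. A minimal sketch of a slightly shorter variant of the snippet above, using reset_index(name=...) on the Series returned by size(); the variable name counts is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'A': ['x', 'x', 'y', 'z', 'z'],
                   'B': ['p', 'p', 'q', 'r', 'r']})

# size() counts the rows in each (A, B) group and returns a Series.
# reset_index(name='c') names that Series 'c' and moves the group keys
# back into ordinary columns, giving a flat DataFrame.
counts = df.groupby(['A', 'B']).size().reset_index(name='c')
print(counts)
```

This should give the same three-row table as the to_frame('c').reset_index() version.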

answered Oct 17 '22 03:10

BENY

pandas >= 1.1: `DataFrame.value_counts`

This gives the same counts as df.groupby(['A', 'B']).size(), except that by default the result is sorted by count in descending order rather than by group key.

df.value_counts(['A', 'B'])

A  B
z  r    2
x  p    2
y  q    1
dtype: int64

df.value_counts(['A', 'B']).reset_index(name='c')

   A  B  c
0  z  r  2
1  x  p  2
2  y  q  1
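If the key-sorted order of the groupby result is preferred, one option is to sort the value_counts result by its index. The following is a small sketch under that assumption; the names by_size, by_value_counts and same are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'A': ['x', 'x', 'y', 'z', 'z'],
                   'B': ['p', 'p', 'q', 'r', 'r']})

# groupby(...).size() orders the result by the group keys.
by_size = df.groupby(['A', 'B']).size()

# value_counts orders by count (descending) by default;
# sort_index() puts the rows back into key order.
by_value_counts = df.value_counts(['A', 'B']).sort_index()

# Compare keys and counts (the Series names may differ between versions).
same = bool(by_size.index.equals(by_value_counts.index)
            and (by_size.to_numpy() == by_value_counts.to_numpy()).all())
print(same)
```

After sorting, both approaches should agree on keys and counts for this frame.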

answered Oct 17 '22 03:10

cs95
