Counting unique index values in Pandas groupby

Tags:

python

pandas

In Pandas, there is a very clean way to count the distinct values in a column within a groupby operation. For example:

import pandas as pd

ex = pd.DataFrame([[1, 2, 3], [6, 7, 8], [1, 7, 9]],
                  columns=["A", "B", "C"]).set_index(["A", "B"])
ex.groupby(level="A").C.nunique()

will return

A
1    2
6    1
Name: C, dtype: int64

I would also like to count the distinct values in index level B while grouping by A. I can't find a clean way to access the levels of B from the groupby object. The best I've been able to come up with is:

ex.reset_index("B", drop=False).groupby(level="A").B.nunique()

which correctly returns:

A
1    2
6    1
Name: B, dtype: int64

Is there a way for me to do this on the groupby without resetting the index or using an apply function?

asked Feb 03 '16 13:02

Tim Hopper

1 Answer

IIUC, you could call reset_index on all levels, then group by 'A' and apply the nunique method to each remaining column:

res = ex.reset_index().groupby('A').agg(lambda x: x.nunique())

In [339]: res
Out[339]:
   B  C
A
1  2  2
6  1  1
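
As a possible simplification (a sketch, not tested against every pandas version): the groupby object has its own nunique method, so the lambda may not be needed. The variable name counts is only illustrative, and the columns are selected explicitly on the assumption that some older pandas releases also report the grouping column in the result.

```python
import pandas as pd

ex = pd.DataFrame([[1, 2, 3], [6, 7, 8], [1, 7, 9]],
                  columns=["A", "B", "C"]).set_index(["A", "B"])

# Use the built-in groupby nunique instead of agg with a lambda.
# Selecting ["B", "C"] keeps the result limited to the non-grouping columns.
counts = ex.reset_index().groupby("A")[["B", "C"]].nunique()
print(counts)
```

For the example frame this should report 2 distinct values of both B and C for A=1, and 1 of each for A=6, matching the output above.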

Same solution with pivot_table:

In [341]: ex.reset_index().pivot_table(index='A', aggfunc=lambda x: x.nunique())
Out[341]:
   B  C
A
1  2  2
6  1  1
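
If the goal is to leave ex itself untouched, one option that might fit is to build a small frame from the index levels alone and group that. This is a sketch under the assumption that MultiIndex.to_frame(index=False) is available in your pandas version; the names levels and b_per_a are made up for illustration.

```python
import pandas as pd

ex = pd.DataFrame([[1, 2, 3], [6, 7, 8], [1, 7, 9]],
                  columns=["A", "B", "C"]).set_index(["A", "B"])

# Turn the index levels into ordinary columns of a separate frame.
# index=False gives it a plain RangeIndex, so "A" is unambiguous in groupby.
levels = ex.index.to_frame(index=False)

# Count distinct B values within each A; ex keeps its original MultiIndex.
b_per_a = levels.groupby("A")["B"].nunique()
print(b_per_a)
```

This still copies the index values into a new object, so it avoids calling reset_index on the data but not the underlying work.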

answered Sep 28 '22 06:09

Anton Protopopov
