Pandas 'count(distinct)' equivalent

Tags: python, pandas, count, group-by, distinct

I am using Pandas as a database substitute because I have multiple databases (Oracle, SQL Server, etc.), and I cannot work out the sequence of Pandas commands that is equivalent to a SQL query.

I have a table loaded in a DataFrame with some columns:

YEARMONTH, CLIENTCODE, SIZE, etc., etc.

In SQL, counting the number of distinct clients per year-month would be:

SELECT YEARMONTH, count(distinct CLIENTCODE) FROM table GROUP BY YEARMONTH;

And the result would be:

201301    5000
201302    13245

How can I do that in Pandas?


asked Mar 14 '13 13:03

Adriano Almeida


1 Answer

I believe this is what you want:

table.groupby('YEARMONTH').CLIENTCODE.nunique()

Example:

In [2]: table
Out[2]:
   CLIENTCODE  YEARMONTH
0           1     201301
1           1     201301
2           2     201301
3           1     201302
4           2     201302
5           2     201302
6           3     201302

In [3]: table.groupby('YEARMONTH').CLIENTCODE.nunique()
Out[3]:
YEARMONTH
201301       2
201302       3
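For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the small example table by hand (the data is the illustrative sample from above, not the asker's real table) and then converts the grouped result into a two-column frame that looks more like a SQL result set. The column name `DISTINCT_CLIENTS` is my own choice for illustration. As far as I know, `nunique()` skips missing values by default, which matches how `count(distinct ...)` ignores NULLs in SQL.

```python
import pandas as pd

# Hypothetical sample data mirroring the example table in this answer.
table = pd.DataFrame({
    "CLIENTCODE": [1, 1, 2, 1, 2, 2, 3],
    "YEARMONTH": [201301, 201301, 201301, 201302, 201302, 201302, 201302],
})

# Distinct clients per YEARMONTH, returned as a Series indexed by YEARMONTH.
distinct_clients = table.groupby("YEARMONTH")["CLIENTCODE"].nunique()

# Convert to a two-column DataFrame, closer to a SQL result set.
# "DISTINCT_CLIENTS" is an illustrative column name, not from the question.
result = distinct_clients.reset_index(name="DISTINCT_CLIENTS")
print(result)
```

Bracket access (`["CLIENTCODE"]`) is used here instead of attribute access; both select the same column, but the bracket form also works when a column name collides with a DataFrame method or contains spaces.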


answered Sep 21 '22 11:09

Dan Allan
