
How to count distinct values in a column of a pandas group by object?


I have a pandas DataFrame and group it by two columns (for example, col1 and col2). For fixed values of col1 and col2 (i.e., within a single group), col3 can take several different values. I would like to count the number of distinct values in the third column for each group.

For example, if I have this as my input:

1  1  1
1  1  1
1  1  2
1  2  3
1  2  3
1  2  3
2  1  1
2  1  2
2  1  3
2  2  3
2  2  3
2  2  3

I would like to have this table (data frame) as the output:

1  1  2
1  2  1
2  1  3
2  2  1
Roman asked Jul 29 '13 14:07




2 Answers

df.groupby(['col1','col2'])['col3'].nunique().reset_index() 
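To see the one-liner above run end to end, here is a minimal sketch that rebuilds the question's sample data. The column names col1, col2 and col3 are taken from the question's wording; the original table has no header, so treating them as the actual labels is an assumption.

```python
import pandas as pd

# Sample rows from the question; the column labels are assumed from its wording.
df = pd.DataFrame(
    {
        "col1": [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
        "col2": [1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2],
        "col3": [1, 1, 2, 3, 3, 3, 1, 2, 3, 3, 3, 3],
    }
)

# Count distinct col3 values within each (col1, col2) group,
# then turn the group keys back into ordinary columns.
result = df.groupby(["col1", "col2"])["col3"].nunique().reset_index()
print(result)
```

The col3 column of the result holds the per-group distinct counts (2, 1, 3, 1 for this data), which matches the table the question asks for.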
Roman answered Oct 19 '22 00:10


In [17]: df
Out[17]:
    0  1  2
0   1  1  1
1   1  1  1
2   1  1  2
3   1  2  3
4   1  2  3
5   1  2  3
6   2  1  1
7   2  1  2
8   2  1  3
9   2  2  3
10  2  2  3
11  2  2  3

In [19]: df.groupby([0,1])[2].apply(lambda x: len(x.unique()))
Out[19]:
0  1
1  1    2
   2    1
2  1    3
   2    1
dtype: int64
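The apply-based approach and nunique() agree on this data, but they can differ when a group contains missing values. The sketch below compares the two. The second frame, with its columns k and v, is a hypothetical example and not part of the original question.

```python
import numpy as np
import pandas as pd

# The frame from the session above, with integer column labels 0, 1, 2.
df = pd.DataFrame(
    [[1, 1, 1], [1, 1, 1], [1, 1, 2], [1, 2, 3], [1, 2, 3], [1, 2, 3],
     [2, 1, 1], [2, 1, 2], [2, 1, 3], [2, 2, 3], [2, 2, 3], [2, 2, 3]]
)

via_apply = df.groupby([0, 1])[2].apply(lambda x: len(x.unique()))
via_nunique = df.groupby([0, 1])[2].nunique()

# Hypothetical frame with a missing value, used only to show the difference.
with_nan = pd.DataFrame({"k": ["a", "a", "a"], "v": [1.0, np.nan, 1.0]})
grouped = with_nan.groupby("k")["v"]

counts_nan = grouped.apply(lambda x: len(x.unique()))  # unique() keeps NaN
skips_nan = grouped.nunique()                          # dropna=True by default
keeps_nan = grouped.nunique(dropna=False)

print(via_apply.tolist(), via_nunique.tolist())
print(counts_nan["a"], skips_nan["a"], keeps_nan["a"])
```

If NaN should count as its own distinct value, nunique(dropna=False) gives the same result as the apply version without the Python-level lambda.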
Jeff answered Oct 19 '22 01:10