I have a pandas data frame and group it by two columns (for example col1 and col2). For fixed values of col1 and col2 (i.e. for a group) I can have several different values in col3. I would like to count the number of distinct values in the third column for each group.
For example, if I have this as my input:
1  1  1
1  1  1
1  1  2
1  2  3
1  2  3
1  2  3
2  1  1
2  1  2
2  1  3
2  2  3
2  2  3
2  2  3
I would like to have this table (data frame) as the output:
1  1  2
1  2  1
2  1  3
2  2  1
Pandas DataFrame count() Method: by default, count() counts the non-empty (non-NA) values in each column; if you specify axis='columns', it counts them in each row instead. It returns a Series object with one result per column (or row). Note that count() counts non-missing values, not distinct values.
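To make the difference between counting non-missing values and counting distinct values concrete, here is a minimal sketch. The frame and its column names (a, b) are made up for illustration and are not part of the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in column "a".
df = pd.DataFrame({"a": [1, 1, np.nan], "b": [5, 5, 5]})

# Non-missing values in each column (the default, axis=0).
per_column = df.count()

# Non-missing values in each row.
per_row = df.count(axis="columns")

# Distinct non-missing values in each column.
distinct = df.nunique()

print(per_column)
print(per_row)
print(distinct)
```

Here count() reports how many cells are filled in, while nunique() reports how many different values appear, which is what the question asks for.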
df.groupby(['col1','col2'])['col3'].nunique().reset_index()
In [17]: df
Out[17]:
    0  1  2
0   1  1  1
1   1  1  1
2   1  1  2
3   1  2  3
4   1  2  3
5   1  2  3
6   2  1  1
7   2  1  2
8   2  1  3
9   2  2  3
10  2  2  3
11  2  2  3

In [19]: df.groupby([0,1])[2].apply(lambda x: len(x.unique()))
Out[19]:
0  1
1  1    2
   2    1
2  1    3
   2    1
dtype: int64
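For anyone who wants to try this end to end, here is a self-contained sketch that rebuilds the example input and runs both approaches from the answers. The column names col1, col2, col3 follow the question; the variable names (rows, result, via_apply) are only illustrative.

```python
import pandas as pd

# The twelve example rows from the question.
rows = [
    (1, 1, 1), (1, 1, 1), (1, 1, 2),
    (1, 2, 3), (1, 2, 3), (1, 2, 3),
    (2, 1, 1), (2, 1, 2), (2, 1, 3),
    (2, 2, 3), (2, 2, 3), (2, 2, 3),
]
df = pd.DataFrame(rows, columns=["col1", "col2", "col3"])

# Distinct col3 values per (col1, col2) group, returned as a flat data frame.
result = df.groupby(["col1", "col2"])["col3"].nunique().reset_index()

# The apply-based variant; it returns a Series indexed by (col1, col2).
via_apply = df.groupby(["col1", "col2"])["col3"].apply(lambda x: len(x.unique()))

print(result)
print(via_apply)
```

One difference to keep in mind: nunique() ignores NaN by default, whereas len(x.unique()) counts NaN as a value, so the two can disagree on data with missing entries.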