Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python Pandas : pivot table with aggfunc = count unique distinct

This code:

df2 = (     pd.DataFrame({         'X' : ['X1', 'X1', 'X1', 'X1'],          'Y' : ['Y2', 'Y1', 'Y1', 'Y1'],          'Z' : ['Z3', 'Z1', 'Z1', 'Z2']     }) ) g = df2.groupby('X') pd.pivot_table(g, values='X', rows='Y', cols='Z', margins=False, aggfunc='count') 

returns the following error:

Traceback (most recent call last): ...  AttributeError: 'Index' object has no attribute 'index' 

How do I get a Pivot Table with counts of unique values of one DataFrame column for two other columns?
Is there aggfunc for count unique? Should I be using np.bincount()?

NB. I am aware of pandas.Series.values_counts() however I need a pivot table.


EDIT: The output should be:

Z   Z1  Z2  Z3 Y              Y1   1   1 NaN Y2 NaN NaN   1 
like image 801
dmi Avatar asked Oct 12 '12 13:10

dmi


People also ask

How do I count unique values in a pivot table using pandas?

You can get the count distinct values (equivalent to SQL count(distinct) ) in pandas using DataFrame. groupby(), nunique() , DataFrame. agg(), DataFrame.

How do you count occurrences in a DataFrame in Python?

Using the size() or count() method with pandas. DataFrame. groupby() will generate the count of a number of occurrences of data present in a particular column of the dataframe.


2 Answers

Do you mean something like this?

>>> df2.pivot_table(values='X', index='Y', columns='Z', aggfunc=lambda x: len(x.unique()))  Z   Z1  Z2  Z3 Y              Y1   1   1 NaN Y2 NaN NaN   1 

Note that using len assumes you don't have NAs in your DataFrame. You can do x.value_counts().count() or len(x.dropna().unique()) otherwise.

like image 76
Chang She Avatar answered Sep 23 '22 22:09

Chang She


This is a good way of counting entries within .pivot_table:

>>> df2.pivot_table(values='X', index=['Y','Z'], columns='X', aggfunc='count')          X1  X2 Y   Z        Y1  Z1   1   1     Z2   1  NaN Y2  Z3   1  NaN 
like image 27
julian peng Avatar answered Sep 21 '22 22:09

julian peng