This code:
    df2 = pd.DataFrame({
        'X': ['X1', 'X1', 'X1', 'X1'],
        'Y': ['Y2', 'Y1', 'Y1', 'Y1'],
        'Z': ['Z3', 'Z1', 'Z1', 'Z2'],
    })
    g = df2.groupby('X')
    pd.pivot_table(g, values='X', rows='Y', cols='Z', margins=False, aggfunc='count')
returns the following error:
    Traceback (most recent call last):
      ...
    AttributeError: 'Index' object has no attribute 'index'
How do I get a Pivot Table with counts of unique values of one DataFrame column for two other columns?
Is there an aggfunc for count unique? Should I be using np.bincount()?
NB. I am aware of pandas.Series.value_counts(); however, I need a pivot table.
EDIT: The output should be:
    Z   Z1  Z2  Z3
    Y
    Y1   1   1 NaN
    Y2 NaN NaN   1
You can get the count of distinct values (equivalent to SQL count(distinct)) in pandas using DataFrame.groupby() together with nunique() or DataFrame.agg().
Using the size() or count() method with pandas.DataFrame.groupby() instead gives the number of occurrences in each group, not the number of distinct values.
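The difference between the two approaches can be sketched on the DataFrame from the question. This is only an illustration: the variable names (distinct_counts, row_counts, wide) are made up here, and unstack() is just one way to get a pivot-style layout out of the grouped result.

```python
import pandas as pd

df2 = pd.DataFrame({
    "X": ["X1", "X1", "X1", "X1"],
    "Y": ["Y2", "Y1", "Y1", "Y1"],
    "Z": ["Z3", "Z1", "Z1", "Z2"],
})

# Distinct X values per (Y, Z) pair, like SQL COUNT(DISTINCT X).
distinct_counts = df2.groupby(["Y", "Z"])["X"].nunique()

# Number of rows per (Y, Z) pair, regardless of what X contains.
row_counts = df2.groupby(["Y", "Z"]).size()

# Move Z from the row index to the columns to get a pivot-style table.
# (Y, Z) combinations that never occur become NaN.
wide = distinct_counts.unstack("Z")
print(wide)
```

For the pair (Y1, Z1) there are two rows but only one distinct X value, so row_counts and distinct_counts disagree there.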
Do you mean something like this?
    >>> df2.pivot_table(values='X', index='Y', columns='Z', aggfunc=lambda x: len(x.unique()))
    Z   Z1  Z2  Z3
    Y
    Y1   1   1 NaN
    Y2 NaN NaN   1
Note that using len assumes you don't have NAs in your DataFrame. You can do x.value_counts().count() or len(x.dropna().unique()) otherwise.
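A small sketch of the NA caveat, using a made-up Series s that contains one missing value. It also tries Series.nunique(), which skips NA by default, and passes the string 'nunique' as aggfunc. That last part is a swapped-in alternative to the lambda above, not something taken from the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value.
s = pd.Series(["X1", "X1", np.nan])

with_na = len(s.unique())                # NaN is counted as a value here
via_value_counts = s.value_counts().count()
via_dropna = len(s.dropna().unique())
via_nunique = s.nunique()                # dropna=True by default

print(with_na, via_value_counts, via_dropna, via_nunique)

# The same idea inside pivot_table, on the DataFrame from the question.
df2 = pd.DataFrame({
    "X": ["X1", "X1", "X1", "X1"],
    "Y": ["Y2", "Y1", "Y1", "Y1"],
    "Z": ["Z3", "Z1", "Z1", "Z2"],
})
pivot = df2.pivot_table(values="X", index="Y", columns="Z", aggfunc="nunique")
print(pivot)
```

Only len(s.unique()) includes the missing value; the other three expressions agree with each other.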
This is a good way of counting entries within .pivot_table:
    >>> df2.pivot_table(values='X', index=['Y','Z'], columns='X', aggfunc='count')
    X      X1  X2
    Y  Z
    Y1 Z1   1   1
       Z2   1 NaN
    Y2 Z3   1 NaN
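For comparison with the layout requested in the question, here is a sketch that uses aggfunc='count' with Y as rows and Z as columns. It counts non-NA entries per cell rather than distinct values, so it can differ from the expected output in the question; the name counts is only illustrative.

```python
import pandas as pd

df2 = pd.DataFrame({
    "X": ["X1", "X1", "X1", "X1"],
    "Y": ["Y2", "Y1", "Y1", "Y1"],
    "Z": ["Z3", "Z1", "Z1", "Z2"],
})

# Count the non-NA X entries for every (Y, Z) combination.
counts = df2.pivot_table(values="X", index="Y", columns="Z", aggfunc="count")
print(counts)
```

Here the (Y1, Z1) cell is 2 because that combination appears in two rows, even though both rows hold the same X value.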