Pandas Pivot Table Dictionary of Agg function
I am trying to calculate three aggregate functions while pivoting.
This is the code:
n_page = (pd.pivot_table(Main_DF,
                         values='SPC_RAW_VALUE',
                         index=['ALIAS', 'SPC_PRODUCT', 'LABLE', 'RAW_PARAMETER_NAME'],
                         columns=['LOT_VIRTUAL_LINE'],
                         aggfunc={'N': 'count', 'Mean': np.mean, 'Sigma': np.std})
          .reset_index()
          )
The error I am getting is: KeyError: 'Mean'
How can I calculate those 3 functions?
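aggfunc : the aggregation function. It can be a single function, a list of functions, or a dict; the default is numpy.mean. If a list of functions is passed, the resulting pivot table has hierarchical columns whose top level is the function names. If a dict is passed, each key is a column to aggregate and each value is a function or list of functions, which is why a key such as 'Mean' that is not a column raises a KeyError.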
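As a rough illustration of the list form described above, here is a minimal sketch on made-up data. The frame and its column names (ALIAS, LINE, VALUE) are hypothetical stand-ins, not the asker's Main_DF.

```python
import pandas as pd

# Hypothetical toy data; the column names are shortened stand-ins.
df = pd.DataFrame({
    "ALIAS": ["a", "a", "a", "b", "b", "b"],
    "LINE": ["L1", "L1", "L2", "L1", "L2", "L2"],
    "VALUE": [1.0, 3.0, 5.0, 2.0, 4.0, 8.0],
})

# A list of functions gives two-level columns: (function name, LINE).
table = pd.pivot_table(
    df,
    values="VALUE",
    index="ALIAS",
    columns="LINE",
    aggfunc=["count", "mean", "std"],
)

# The top column level holds the function names, in the order passed.
top_level = list(table.columns.get_level_values(0).unique())
print(top_level)
print(table)
```

String names such as "mean" are used here instead of np.mean; both forms are accepted, and the strings avoid deprecation warnings in recent pandas releases.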
The claim in the accepted answer by @Happy001 that aggfunc can't take a dict is false. We can actually pass a dict to aggfunc.
A really handy feature is the ability to pass a dictionary to aggfunc so you can perform different functions on each of the values you select.
For example:
import pandas as pd
import numpy as np
df = pd.read_excel('sales-funnel.xlsx') #loading xlsx file
table = pd.pivot_table(df, index=['Manager', 'Status'], columns=['Product'], values=['Quantity', 'Price'],
                       aggfunc={'Quantity': len, 'Price': [np.sum, np.mean]}, fill_value=0)
table
In the above code, I am passing a dictionary to aggfunc, performing the len operation on Quantity and the mean and sum operations on Price.
The output was attached as an image in the original post.
The example is taken from pivot table explained.
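Since sales-funnel.xlsx is not reproduced here, the following sketch applies the same dict-style aggfunc to a few made-up rows. The data values are invented for illustration; only the column names follow the example above.

```python
import pandas as pd

# Made-up rows standing in for sales-funnel.xlsx.
df = pd.DataFrame({
    "Manager": ["Ann", "Ann", "Ann", "Bob"],
    "Status": ["won", "won", "pending", "won"],
    "Product": ["CPU", "CPU", "Monitor", "CPU"],
    "Quantity": [1, 2, 1, 3],
    "Price": [100, 300, 50, 200],
})

# Dict keys are the value columns; each maps to one function or a list.
table = pd.pivot_table(
    df,
    index=["Manager", "Status"],
    columns=["Product"],
    values=["Quantity", "Price"],
    aggfunc={"Quantity": "count", "Price": ["sum", "mean"]},
    fill_value=0,
)
print(table)
```

The resulting columns have three levels (value column, function name, Product), so a single cell is addressed with a tuple such as ("Price", "sum", "CPU").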
The aggfunc argument of pivot_table takes a function or a list of functions, but not a dict, according to the documentation at the time:
aggfunc : function, default numpy.mean, or list of functions If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves)
So try
n_page = (pd.pivot_table(Main_DF,
                         values='SPC_RAW_VALUE',
                         index=['ALIAS', 'SPC_PRODUCT', 'LABLE', 'RAW_PARAMETER_NAME'],
                         columns=['LOT_VIRTUAL_LINE'],
                         aggfunc=[len, np.mean, np.std])
          .reset_index()
          )
You may want to rename the hierarchical columns afterwards.
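One possible way to do that renaming, sketched on made-up data: rename the top level to the labels from the question (N, Mean, Sigma), then optionally flatten the two levels into single strings. The toy frame and the "stat_line" naming scheme are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data; the column names are shortened stand-ins.
df = pd.DataFrame({
    "ALIAS": ["a", "a", "a", "b", "b", "b"],
    "LINE": ["L1", "L1", "L2", "L1", "L2", "L2"],
    "VALUE": [1.0, 3.0, 5.0, 2.0, 4.0, 8.0],
})

table = pd.pivot_table(df, values="VALUE", index="ALIAS", columns="LINE",
                       aggfunc=["count", "mean", "std"])

# Map the function names on the top level to the labels wanted in the question.
labels = {"count": "N", "mean": "Mean", "std": "Sigma"}
table = table.rename(columns=labels, level=0)

# Optionally flatten ("N", "L1") into "N_L1" before resetting the index.
table.columns = [f"{stat}_{line}" for stat, line in table.columns]
table = table.reset_index()
print(table)
```

Flattening before reset_index keeps the index column name (ALIAS here) as a plain string rather than a tuple.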
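Try using groupby:
df = (Main_DF
      .groupby(['ALIAS', 'SPC_PRODUCT', 'LABLE', 'RAW_PARAMETER_NAME'], as_index=False)
      # Named aggregation (pandas >= 0.25) on the measured values;
      # dict-based renaming in agg was removed in pandas 1.0.
      .agg(N=('SPC_RAW_VALUE', 'count'),
           Mean=('SPC_RAW_VALUE', 'mean'),
           Sigma=('SPC_RAW_VALUE', 'std'))
      )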
Setting as_index=False just leaves the grouping keys as ordinary columns in your dataframe, so you don't have to reset the index afterwards.
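A small sketch of what as_index=False changes, on made-up data and using named aggregation (available since pandas 0.25). The frame and its column names are hypothetical stand-ins for Main_DF.

```python
import pandas as pd

# Hypothetical toy data; the column names are shortened stand-ins.
df = pd.DataFrame({
    "ALIAS": ["a", "a", "a", "b", "b", "b"],
    "LINE": ["L1", "L1", "L2", "L1", "L2", "L2"],
    "VALUE": [1.0, 3.0, 5.0, 2.0, 4.0, 8.0],
})
keys = ["ALIAS", "LINE"]
stats = {"N": ("VALUE", "count"), "Mean": ("VALUE", "mean"), "Sigma": ("VALUE", "std")}

# Grouping keys stay as regular columns.
flat = df.groupby(keys, as_index=False).agg(**stats)

# Default behaviour: keys become the index, so reset_index() is needed.
indexed = df.groupby(keys).agg(**stats).reset_index()

print(list(flat.columns))
print(list(indexed.columns))
print(flat)
```

Both routes end with the same column layout; as_index=False simply skips the extra reset_index step.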