Pandas Pivot Table Dictionary of Agg function
I am trying to calculate 3 aggregate functions while pivoting.
This is the code:
n_page = (pd.pivot_table(Main_DF, 
                         values='SPC_RAW_VALUE',  
                         index=['ALIAS', 'SPC_PRODUCT', 'LABLE', 'RAW_PARAMETER_NAME'], 
                         columns=['LOT_VIRTUAL_LINE'],
                         aggfunc={'N': 'count', 'Mean': np.mean, 'Sigma': np.std})
          .reset_index()
         )
The error I am getting is: KeyError: 'Mean'
How can I calculate those 3 functions?
aggfunc: the aggregation function. It can be set to a single function, a list of functions, or a dict; the default is numpy.mean. If it is set to a list of functions, the resulting pivot table has hierarchical columns, with the function names as the top level.
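As a minimal sketch of the list case described above, the snippet below pivots a small made-up frame (the names toy, group, line and value are hypothetical, not from the question) and inspects the resulting column levels. Function names are passed as strings here, which is an assumption of convenience; callables such as len also work.

```python
import pandas as pd

# Hypothetical toy data; column names are illustrative only.
toy = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "line": ["x", "y", "x", "y"],
    "value": [1.0, 3.0, 5.0, 7.0],
})

# A list of functions gives one block of columns per function.
listed = pd.pivot_table(toy, values="value", index="group",
                        columns="line", aggfunc=["count", "mean"])

# The columns are a MultiIndex: function name on top, 'line' values below.
top_level = list(listed.columns.get_level_values(0).unique())
print(top_level)
print(listed)
```

Individual cells can then be addressed with a tuple such as listed.loc["a", ("mean", "x")].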
The claim in the accepted answer by @Happy001 that aggfunc can't take a dict is false; we can actually pass a dict to aggfunc.
A really handy feature is the ability to pass a dictionary to aggfunc so you can perform different functions on each of the values you select.
For example:
import pandas as pd
import numpy as np
df = pd.read_excel('sales-funnel.xlsx')  #loading xlsx file
table = pd.pivot_table(df, index=['Manager', 'Status'], columns=['Product'], values=['Quantity','Price'],
           aggfunc={'Quantity':len,'Price':[np.sum, np.mean]},fill_value=0)
table
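In the code above, I pass a dictionary to aggfunc, applying len to Quantity and both sum and mean to Price. Note that the dictionary keys must be column names from values, not labels for the output, which is why keys such as 'N', 'Mean' and 'Sigma' in the question raise KeyError: 'Mean'.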
The example is taken from the article "Pandas Pivot Table Explained".
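Since the example above depends on an Excel file, here is a self-contained sketch of the same idea applied to a single value column, as in the question. The frame and its column names (toy, group, line, value) are hypothetical; the point is only that the dict key is the name of the column being aggregated, and its value is the list of functions to apply.

```python
import pandas as pd

# Hypothetical toy data standing in for Main_DF.
toy = pd.DataFrame({
    "group": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "line": ["x", "x", "y", "y", "x", "x", "y", "y"],
    "value": [1.0, 3.0, 2.0, 6.0, 5.0, 7.0, 4.0, 8.0],
})

# Dict key = column to aggregate; dict value = function(s) to apply to it.
by_column = pd.pivot_table(
    toy,
    values=["value"],
    index="group",
    columns="line",
    aggfunc={"value": ["count", "mean", "std"]},
)

# Columns are (value column, function, 'line' value) tuples.
print(by_column[("value", "mean", "x")])
```

With one value column this gives the same numbers as passing the list of functions directly; the dict form becomes useful when different columns need different functions.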
At the time of this answer, the documented aggfunc argument of pivot_table took a function or a list of functions, but not a dict:
aggfunc : function, default numpy.mean, or list of functions. If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves).
So try
n_page = (pd.pivot_table(Main_DF, 
                         values='SPC_RAW_VALUE',  
                         index=['ALIAS', 'SPC_PRODUCT', 'LABLE', 'RAW_PARAMETER_NAME'], 
                         columns=['LOT_VIRTUAL_LINE'],
                         aggfunc=[len, np.mean, np.std])
          .reset_index()
         )
You may want to rename the hierarchical columns afterwards.
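One possible way to do that renaming, sketched on hypothetical toy data (the names toy, group, line and value are made up, and string function names are used instead of len and the numpy functions): rename only the top level of the column MultiIndex, then optionally flatten it before resetting the index.

```python
import pandas as pd

# Hypothetical toy data standing in for Main_DF.
toy = pd.DataFrame({
    "group": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "line": ["x", "x", "y", "y", "x", "x", "y", "y"],
    "value": [1.0, 3.0, 2.0, 6.0, 5.0, 7.0, 4.0, 8.0],
})

table = pd.pivot_table(toy, values="value", index="group",
                       columns="line", aggfunc=["count", "mean", "std"])

# Rename only the top (function-name) level of the columns.
renamed = table.rename(
    columns={"count": "N", "mean": "Mean", "std": "Sigma"}, level=0
)

# Optionally flatten to single-level names such as 'Mean_x'.
flat = renamed.copy()
flat.columns = [f"{stat}_{line}" for stat, line in flat.columns]
flat = flat.reset_index()
print(flat)
```

Flattening before reset_index() avoids ending up with index columns that carry an empty second level.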
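Try using groupby:
df = (Main_DF
      .groupby(['ALIAS', 'SPC_PRODUCT', 'LABLE', 'RAW_PARAMETER_NAME'], as_index=False)
      # Named aggregation (pandas 0.25+): output name=(input column, function).
      # The statistics are taken over SPC_RAW_VALUE, the values column in the question.
      .agg(N=('SPC_RAW_VALUE', 'count'),
           Mean=('SPC_RAW_VALUE', 'mean'),
           Sigma=('SPC_RAW_VALUE', 'std'))
     )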
Setting as_index=False leaves the grouping keys as columns in your dataframe, so you don't have to reset the index afterwards.
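The effect of as_index=False can be checked on a small made-up frame (toy, group and value are hypothetical names). This sketch uses named aggregation, which is an assumption about the pandas version in use (0.25 or later), and compares the result with the reset_index() route.

```python
import pandas as pd

# Hypothetical toy data.
toy = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "value": [1.0, 3.0, 5.0, 7.0],
})

# Grouping key stays a regular column.
with_flag = toy.groupby("group", as_index=False).agg(
    N=("value", "count"), Mean=("value", "mean")
)

# Grouping key becomes the index, then is moved back to a column.
with_reset = toy.groupby("group").agg(
    N=("value", "count"), Mean=("value", "mean")
).reset_index()

print(with_flag)
print(with_reset)
```

Both frames should have the columns group, N and Mean with a default integer index.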