I have a dataframe with 4 columns: col_A, col_B, col_C, col_D. I need to group by the columns (col_A, col_B, col_C) and aggregate the mean of col_D. Below is the code snippet I tried, and it worked:
df.groupby(['col_A','col_B','col_C']).agg({'col_D':'mean'}).reset_index()
But in addition to the above result, I also need the count of rows in each ('col_A', 'col_B', 'col_C') group alongside the mean. Any help on this, please?
Using Named Aggregation:
result = (
    df.groupby(['col_A', 'col_B', 'col_C'], as_index=False)
      .agg(mean=('col_D', 'mean'), count=('col_D', 'count'))
)
For the count column, you have two choices of aggregation function:
count=('col_D', 'count') ignores any NaN values in col_D
count=('col_D', 'size') includes NaN values in col_D
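To see the difference between 'count' and 'size', here is a minimal sketch on made-up data. The sample values and the output column names (mean_D, n_non_null, n_rows) are assumptions chosen for illustration; they are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the first group has one NaN in col_D.
df = pd.DataFrame({
    "col_A": ["x", "x", "y"],
    "col_B": [1, 1, 2],
    "col_C": ["p", "p", "q"],
    "col_D": [10.0, np.nan, 30.0],
})

result = (
    df.groupby(["col_A", "col_B", "col_C"], as_index=False)
      .agg(
          mean_D=("col_D", "mean"),       # mean skips NaN by default
          n_non_null=("col_D", "count"),  # counts only non-NaN values
          n_rows=("col_D", "size"),       # counts every row in the group
      )
)
print(result)
```

For the ('x', 1, 'p') group, n_non_null is 1 while n_rows is 2, because the NaN row is counted only by 'size'. If col_D never contains NaN, the two give the same number.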