
Pandas groupBy multiple columns and aggregation

Tags:

python

pandas

I have a DataFrame with four columns: col_A, col_B, col_C, and col_D. I need to group by col_A, col_B, and col_C and compute the mean of col_D. The snippet below is what I tried, and it works:

df.groupby(['col_A','col_B','col_C']).agg({'col_D':'mean'}).reset_index()

In addition to the mean, I also need the row count for each (col_A, col_B, col_C) group in the same result. Any help would be appreciated.

Prakash asked Nov 25 '25 07:11



1 Answer

Use named aggregation:

result = (
    df.groupby(['col_A', 'col_B', 'col_C'], as_index=False)
      .agg(mean=('col_D', 'mean'), count=('col_D', 'count'))
)

For the count column, you can choose between two aggregation functions:

  • count=('col_D', 'count') counts only the non-NaN values of col_D in each group
  • count=('col_D', 'size') counts every row in each group, including rows where col_D is NaN
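
To make the difference between 'count' and 'size' concrete, here is a minimal sketch on made-up data. The sample values and the output column names (mean, count, size) are assumptions chosen for illustration; they are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the first group has one NaN in col_D.
df = pd.DataFrame({
    "col_A": ["x", "x", "x", "y"],
    "col_B": [1, 1, 1, 2],
    "col_C": ["p", "p", "p", "q"],
    "col_D": [10.0, np.nan, 30.0, 5.0],
})

result = (
    df.groupby(["col_A", "col_B", "col_C"], as_index=False)
      .agg(
          mean=("col_D", "mean"),    # mean skips NaN by default
          count=("col_D", "count"),  # non-NaN values only
          size=("col_D", "size"),    # all rows in the group
      )
)

print(result)
# Group (x, 1, p): mean 20.0, count 2, size 3
# Group (y, 2, q): mean 5.0,  count 1, size 1
```

If col_D never contains NaN, the two functions return the same numbers, so the choice only matters when missing values are possible.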

Code Different answered Nov 26 '25 22:11



