
Pandas group by with multiple custom aggregate functions on multiple columns

Given data:

grp data1 data2 data3
a 2 1 2
a 4 6 3
b 3 2 1
b 7 3 5

Expected output:

grp sum(data1) sum(data2)/sum(data1) sum(data3)/sum(data1)
a 6 1.166666667 0.833333333
b 10 0.5 0.6

Assume the custom aggregation can depend on multiple columns and is not always a simple division. I know this is possible with a SQL query, but I am interested in an answer that uses apply and aggregate functions, if possible.

Parshant garg asked Jan 25 '26 14:01


1 Answer

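You can use groupby + assign here to generate the required aggregations: aggregate one column per group first, then derive the remaining columns from the other per-group aggregates. Any aggregate function can be substituted for sum.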

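import pandas as pd

# Sample data from the question
df = pd.DataFrame({'grp': ['a', 'a', 'b', 'b'],
                   'data1': [2, 4, 3, 7],
                   'data2': [1, 6, 2, 3],
                   'data3': [2, 3, 1, 5]})

g = df.groupby('grp')
# Sum data1 per group, then divide the other per-group sums by it.
# For a custom aggregation, replace 'sum' / .sum() with .agg(custom_agg_func).
g[['data1']].agg('sum').assign(sum2=lambda d: g['data2'].sum() / d['data1'],
                               sum3=lambda d: g['data3'].sum() / d['data1'])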

     data1      sum2      sum3
grp                           
a        6  1.166667  0.833333
b       10  0.500000  0.600000

Ch3steR answered Jan 28 '26 04:01
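
Since the question asks about apply, the same result can probably also be reached with groupby + apply, where a single function sees every column of a group and can combine them freely. The following is only a minimal sketch under that assumption: the helper name summarize and the output column names sum1, ratio2 and ratio3 are made up for illustration, and the data is the sample from the question.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "grp": ["a", "a", "b", "b"],
    "data1": [2, 4, 3, 7],
    "data2": [1, 6, 2, 3],
    "data3": [2, 3, 1, 5],
})


def summarize(group: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: one row of aggregates per group.

    The function receives the whole sub-frame, so each aggregate
    may depend on any combination of columns.
    """
    total1 = group["data1"].sum()
    return pd.Series({
        "sum1": total1,
        "ratio2": group["data2"].sum() / total1,
        "ratio3": group["data3"].sum() / total1,
    })


# Selecting the value columns explicitly keeps the grouping column
# out of the frame that is passed to summarize.
result = df.groupby("grp")[["data1", "data2", "data3"]].apply(summarize)
print(result)
```

Because the returned Series mixes an integer sum with float ratios, all three result columns come back as floats. apply calls the Python function once per group, so on large data the groupby + assign approach above is likely to be faster; apply trades speed for the freedom to write arbitrary multi-column logic.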



