pandas groupby aggregate customised function with multiple columns

Question

I am trying to use a customised function with groupby in pandas. I find that using apply allows me to do that in the following way:

(The example computes a count-weighted mean for each pool.)

import pandas as pd

def newAvg(x):
    # Count-weighted mean, computed without adding a column to the group
    sCount = x['count'].sum()
    sMean = (x['count'] * x['mean']).sum()
    return sMean / sCount

data = [['A', 4, 2.5], ['A', 3, 6], ['B', 4, 9.5], ['B', 3, 13]]
df = pd.DataFrame(data, columns=['pool', 'count', 'mean'])

df_gb = df.groupby(['pool']).apply(newAvg)

Is it possible to integrate this into an agg function? Along these lines:

df.groupby(['pool']).agg({'count': sum, ['count', 'mean']: apply(newAvg)})

BENY · Accepted Answer

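If I understand correctly, you can return a pd.Series from apply, which lets you compute several aggregates (including ones that need more than one column) in a single pass: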

df.groupby(['pool']).apply(lambda x : pd.Series({'count':sum(x['count']),'newavg':newAvg(x)}))
Out[58]: 
      count  newavg
pool               
A       7.0     4.0
B       7.0    11.0
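
As far as I know, agg with a dict or named aggregation hands each function a single column at a time, so it has no direct way to combine 'count' and 'mean'. One possible workaround, sketched below, is to precompute the row-wise product first and then use plain sums in agg. The names cm, out and newavg are just illustrative choices, and this is an alternative to the apply approach above rather than a replacement for it.

```python
import pandas as pd

data = [['A', 4, 2.5], ['A', 3, 6], ['B', 4, 9.5], ['B', 3, 13]]
df = pd.DataFrame(data, columns=['pool', 'count', 'mean'])

# Precompute count * mean per row so agg only needs single-column sums.
out = (
    df.assign(cm=df['count'] * df['mean'])
      .groupby('pool')
      .agg(count=('count', 'sum'), cm=('cm', 'sum'))
)

# Weighted mean = sum(count * mean) / sum(count); drop the helper column.
out['newavg'] = out['cm'] / out['count']
out = out.drop(columns='cm')
print(out)
```

This avoids calling a Python function once per group, which may matter on larger frames, and it keeps 'count' as an integer column instead of upcasting it to float.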

Tags:

python

pandas

aggregate

pandas-groupby

Christian

1 Answers

BENY
