
Pandas: using multiple functions in a group by

My data has ages, and also payments per month.

I'm trying to aggregate by summing the payments, but without summing the ages (averaging them would work).

Is it possible to use different functions for different columns?

sapo_cosmico asked Mar 17 '23 17:03


1 Answer

You can pass a dictionary to agg with column names as keys and the functions you want as values.

import pandas as pd
import numpy as np

# Create some sample data (sequential values, one row per week)
N = 20
date_range = pd.date_range('01/01/2015', periods=N, freq='W')
df = pd.DataFrame({'ages':np.arange(N), 'payments':np.arange(N)*10}, index=date_range)

print(df.head())
#             ages  payments
# 2015-01-04     0         0
# 2015-01-11     1        10
# 2015-01-18     2        20
# 2015-01-25     3        30
# 2015-02-01     4        40

# Average the ages column and sum the payments column.
# String names are used because recent pandas versions warn when NumPy callables are passed to agg.
agg_funcs = {'ages': 'mean', 'payments': 'sum'}

# Group by calendar month, then apply the functions in agg_funcs
grouped = df.groupby(df.index.to_period('M')).agg(agg_funcs)

print(grouped)
#          ages  payments
# 2015-01   1.5        60
# 2015-02   5.5       220
# 2015-03  10.0       500
# 2015-04  14.5       580
# 2015-05  18.0       540
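
As a possible alternative to the dictionary form, pandas 0.25 and later also support named aggregation, which lets you choose the output column names at the same time. The sketch below reuses the sample data from above; the output names mean_age and total_payments are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same sample data as above: 20 weekly rows starting in January 2015.
N = 20
date_range = pd.date_range('01/01/2015', periods=N, freq='W')
df = pd.DataFrame({'ages': np.arange(N), 'payments': np.arange(N) * 10}, index=date_range)

# Named aggregation: output_column=(input_column, function).
# mean_age and total_payments are hypothetical names chosen for this sketch.
grouped_named = df.groupby(df.index.to_period('M')).agg(
    mean_age=('ages', 'mean'),
    total_payments=('payments', 'sum'),
)

print(grouped_named)
```

The values should match the dictionary version; only the column labels differ, which can be handy when the aggregated column no longer means the same thing as the input column.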
Ffisegydd answered Mar 25 '23 09:03