
Question

I'm trying to compute an expanding mean. It works when I iterate and "group" by filtering on specific values, but that takes far too long. I feel this should be easy to do with a groupby, but when I try it, the expanding mean is computed over the entire dataset rather than within each of the groups in the groupby.

For a quick example:

I want to take this (in this particular case, grouped by 'player' and 'year'), and get an expanding mean.

player  pos year    wk  pa  ra
a       qb  2001    1   10  0       
a       qb  2001    2   5   0
a       qb  2001    3   10  0
a       qb  2002    1   12  0
a       qb  2002    2   13  0
b       rb  2001    1   0   20
b       rb  2001    2   0   17
b       rb  2001    3   0   12
b       rb  2002    1   0   14
b       rb  2002    2   0   15

To get:

player  pos year    wk  pa  ra  avg_pa  avg_ra
a       qb  2001    1   10  0   10      0
a       qb  2001    2   5   0   7.5     0
a       qb  2001    3   10  0   8.3     0
a       qb  2002    1   12  0   12      0
a       qb  2002    2   13  0   12.5    0
b       rb  2001    1   0   20  0       20
b       rb  2001    2   0   17  0       18.5
b       rb  2001    3   0   12  0       16.3
b       rb  2002    1   0   14  0       14
b       rb  2002    2   0   15  0       14.5

Not sure where I'm going wrong:

# Group by player and season - also put weeks in correct ascending order
grouped = calc_averages.groupby(['player','pos','seas']).apply(pd.DataFrame.sort_values, 'wk')


grouped['avg_pa'] = grouped['pa'].expanding().mean()

But this gives an expanding mean over the entire set, not a separate one for each player and season.

Scott Boston · Accepted Answer

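groupby(...).apply(...) returns an ordinary DataFrame, so calling expanding() on it afterwards runs over all rows. Call expanding() on the groupby object itself instead. Note the double brackets: selecting several columns with ['pa','ra'] is no longer accepted by recent pandas versions.

Try:

df.sort_values('wk').groupby(['player','pos','year'])[['pa','ra']].expanding().mean()\
  .reset_index()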

Output:

  player pos  year  level_3         pa         ra
0      a  qb  2001        0  10.000000   0.000000
1      a  qb  2001        1   7.500000   0.000000
2      a  qb  2001        2   8.333333   0.000000
3      a  qb  2002        3  12.000000   0.000000
4      a  qb  2002        4  12.500000   0.000000
5      b  rb  2001        5   0.000000  20.000000
6      b  rb  2001        6   0.000000  18.500000
7      b  rb  2001        7   0.000000  16.333333
8      b  rb  2002        8   0.000000  14.000000
9      b  rb  2002        9   0.000000  14.500000
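The output above holds only the group keys and the means, while the question asks for avg_pa and avg_ra next to the original columns. A minimal sketch of one way to attach them, assuming the sample data from the question; the names keys, expanding_means and result are illustrative and not part of the original answer:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "player": list("aaaaabbbbb"),
    "pos": ["qb"] * 5 + ["rb"] * 5,
    "year": [2001, 2001, 2001, 2002, 2002] * 2,
    "wk": [1, 2, 3, 1, 2] * 2,
    "pa": [10, 5, 10, 12, 13, 0, 0, 0, 0, 0],
    "ra": [0, 0, 0, 0, 0, 20, 17, 12, 14, 15],
})

keys = ["player", "pos", "year"]  # illustrative name for the grouping columns

expanding_means = (
    df.sort_values("wk")
      .groupby(keys)[["pa", "ra"]]
      .expanding()
      .mean()
      .droplevel(keys)      # drop the group-key levels, keep the original row index
      .add_prefix("avg_")   # pa -> avg_pa, ra -> avg_ra
)

# join aligns on the original row index, so each mean lands on its own row.
result = df.join(expanding_means)
print(result)
```

Dropping the key levels leaves the original row labels as the index, which is what lets join line the means up with the rows they came from, whatever order the sort produced.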

Pandas - expanding mean with groupby

Tags:

pandas-groupby

chitown88
