
Python Pandas: Create cumulative average while grouping by other column

Tags:

python

pandas

Imagine a table like this:

name | value 
-----|------
Jack | 0    
Jack | 1
Jack | 0.5
Jack | 1
Jill | 0
Jill | 2

For every name, I'd like to have the cumulative average, like this:

name | value | cumAverage
-----|-------|-----------
Jack | 0     | 0
Jack | 1     | 0.5
Jack | 0.5   | 0.5
Jack | 1     | 0.625
Jill | 0     | 0
Jill | 2     | 1

So whenever a new name appears, the cumulative average should "restart". The name column is sorted, so whenever a new name appears the current cumulative average is finished.

Nicolas asked Jun 13 '26 06:06


1 Answer

You need expanding().mean() with groupby:

df.groupby('name')['value'].expanding().mean().reset_index(0)

For an unsorted df, groupby returns the rows grouped by name, so add sort_index() to restore the original row order:

df.groupby('name')['value'].expanding().mean().reset_index(0).sort_index()

Result for the sample data (the value column now holds the cumulative average):

   name  value
0  Jack  0.000
1  Jack  0.500
2  Jack  0.500
3  Jack  0.625
4  Jill  0.000
5  Jill  1.000
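
To keep the original value column and add the result as a new cumAverage column, as in the desired table, something like the following sketch should work. It rebuilds the sample data from the question and assumes df has a unique index. The variable names expanding_mean, grouped and alt are only illustrative. The cumsum/cumcount variant is an alternative way to get the same numbers, assuming there are no missing values in value.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "name": ["Jack", "Jack", "Jack", "Jack", "Jill", "Jill"],
    "value": [0, 1, 0.5, 1, 0, 2],
})

# Expanding mean per group. The result is a Series with a
# (name, original index) MultiIndex.
expanding_mean = df.groupby("name")["value"].expanding().mean()

# Drop the group level so the Series aligns with df's own index on
# assignment. Alignment is by index label, so this also works when
# the rows are not sorted by name (assumes the index is unique).
df["cumAverage"] = expanding_mean.reset_index(level=0, drop=True)

# Alternative (assumes no NaN in "value"): running sum divided by
# running count within each group.
grouped = df.groupby("name")["value"]
alt = grouped.cumsum() / (grouped.cumcount() + 1)

print(df)
```

Because the assignment aligns on the index, no sort_index() call is needed in this form.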
anky answered Jun 15 '26 20:06


