
pandas groupby last n

What is the best way to get the mean of the last n instances using pandas groupby?

For example I have a dataframe like this:

import numpy as np
import pandas as pd
frame = pd.DataFrame({'Student': ['Bob', 'Bill', 'Bob', 'Bob', 'Bill', 'Joe', 'Joe', 'Bill', 'Bob', 'Joe'],
                      'Score': np.random.random(10)})

How do I get the mean of the last 3 scores for each student?

user2333196 asked Mar 30 '14 21:03


1 Answer

Maybe something like this (here df is the DataFrame called frame in the question)?

>>> df.groupby("Student")["Score"].apply(lambda x: x.iloc[-3:].mean())
Student
Bill       0.513128
Bob        0.342806
Joe        0.469662
Name: Score, dtype: float64

You can access the last three (or fewer) elements using .iloc[-3:], and then take the mean using .mean().
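
As a small sketch of the "or fewer" case, here is the same call on made-up, fixed scores (not the random ones from the question). In this hypothetical data Bill has only two rows, so his mean is taken over those two:

```python
import pandas as pd

# Hypothetical fixed scores so the result is reproducible
df = pd.DataFrame({
    "Student": ["Bob", "Bill", "Bob", "Bob", "Bill", "Bob"],
    "Score": [1.0, 2.0, 3.0, 5.0, 4.0, 7.0],
})

# Bob has four rows, so only his last three (3.0, 5.0, 7.0) are averaged.
# Bill has two rows, so .iloc[-3:] simply returns both of them.
last3_mean = df.groupby("Student")["Score"].apply(lambda x: x.iloc[-3:].mean())
print(last3_mean)
```

Note that "last" here means last in the DataFrame's current row order, so if the rows are not already in chronological order you would probably want to sort them first.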

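Alternatively, you could use .tail(3) in place of .iloc[-3:] inside the lambda, or do it in two passes, first keeping the last three rows of each group and then grouping again to take the mean: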

>>> df.groupby("Student").tail(3).groupby("Student")["Score"].mean()
Student
Bill       0.513128
Bob        0.342806
Joe        0.469662
Name: Score, dtype: float64
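
To check that the one-pass and two-pass versions agree, here is a rough sketch using the question's student list with made-up scores of 1.0 through 10.0 (an assumption, chosen only so the numbers are easy to verify by hand):

```python
import pandas as pd

# Same students as in the question; the scores are hypothetical fixed values
df = pd.DataFrame({
    "Student": ["Bob", "Bill", "Bob", "Bob", "Bill", "Joe", "Joe", "Bill", "Bob", "Joe"],
    "Score": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
})

# One pass: slice the last three rows inside each group, then average
via_apply = df.groupby("Student")["Score"].apply(lambda x: x.iloc[-3:].mean())

# Two passes: keep the last three rows per group, then group again and average
via_tail = df.groupby("Student").tail(3).groupby("Student")["Score"].mean()

print(via_apply)
print(via_tail)
```

With this data Bob's last three scores are 3.0, 4.0 and 9.0, Bill's are 2.0, 5.0 and 8.0, and Joe's are 6.0, 7.0 and 10.0, and both expressions should give the same per-student means.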

DSM answered Sep 25 '22 00:09