Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R - calculating the average value of a dataframe column from the top row to bottom row

Tags:

r

The title may not be that clear, since it was difficult to summarize the problem in a few words, although I don't think the problem is that difficult to solve. To explain the problem, let me share a dataframe for reference:

head(df, n = 10)

     team     score 
1       A        10       
2       A         4        
3       A        10        
4       A        16        
5       A        20        
6       B         5
7       B        11         
8       B         8    
9       B        16         
10      B         5       

I'd like to add a third column, that calculates the average score for each team, with the average score updating as I go down the rows for each team, and then resetting at a new team. For example, the output column I am hoping for would look like this:

head(df, n = 10)

     team     score   avg_score
1       A        10          10 
2       A         4           7
3       A        10           8
4       A        16          10
5       A        20          12
6       B         5           5
7       B        11           8 
8       B         8           8
9       B        16          10 
10      B         5           9


# row1: 10 = 10  
# row2: 7 = (10 + 4)/2  
# row3: 8 = (10 + 4 + 10)/3   
# ... 

with the pattern following, and the calculation restarting for a new team.

Thanks,

like image 277
Canovice Avatar asked Aug 23 '16 21:08

Canovice


People also ask

How do you calculate the average of a column in a DataFrame?

To get column average or mean from pandas DataFrame use either mean() and describe() method. The DataFrame. mean() method is used to return the mean of the values for the requested axis.

How do I get the average of each row in R?

rowMeans() function in R Language is used to find out the mean of each row of a data frame, matrix, or array.

How do you calculate the average of a row in a DataFrame?

The row average can be found using DataFrame. mean() function. It returns the mean of the values over the requested axis.

How do you find the average of a DataFrame in R?

In this method of computing, the mean of the given dataframe column user just need to call the colMeans function which is an in-build function in R language and pass the dataframe as its parameter, then this will be returning the mean of all the column in the provided dataframe to the user.


1 Answers

library("data.table")
setDT(df)[, `:=` (avg_score = cumsum(score)/1:.N), by = team]

or more readable as per the comment by @snoram

setDT(dt)[, avg_score := cumsum(score)/(1:.N), by = team]

#    team score avg_score
# 1:    A    10        10
# 2:    A     4         7
# 3:    A    10         8
# 4:    A    16        10
# 5:    A    20        12
# 6:    B     5         5
# 7:    B    11         8
# 8:    B     8         8
# 9:    B    16        10
# 10:    B     5         9
like image 68
Sathish Avatar answered Sep 21 '22 10:09

Sathish