R - calculating the average value of a dataframe column from the top row to bottom row

Tags:

r

The title may not be very clear, since the problem was hard to summarize in a few words, although I don't think it is hard to solve. To explain, here is a data frame for reference:

head(df, n = 10)

     team     score 
1       A        10       
2       A         4        
3       A        10        
4       A        16        
5       A        20        
6       B         5
7       B        11         
8       B         8    
9       B        16         
10      B         5

I'd like to add a third column that holds the running average score for each team: the average updates as I go down the rows within a team and resets when a new team starts. For example, the output column I am hoping for would look like this:

head(df, n = 10)

     team     score   avg_score
1       A        10          10 
2       A         4           7
3       A        10           8
4       A        16          10
5       A        20          12
6       B         5           5
7       B        11           8 
8       B         8           8
9       B        16          10 
10      B         5           9


# row1: 10 = 10  
# row2: 7 = (10 + 4)/2  
# row3: 8 = (10 + 4 + 10)/3   
# ...

The pattern continues down the rows, and the calculation restarts for each new team.

Thanks,


asked Aug 23 '16 21:08

Canovice

1 Answer

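library("data.table")
# setDT() converts df to a data.table by reference; cumsum(score)/1:.N is the
# running sum divided by the row position within each team (.N = group size).
setDT(df)[, `:=` (avg_score = cumsum(score)/1:.N), by = team]

Or, more readably, as suggested in a comment by @snoram:

setDT(df)[, avg_score := cumsum(score)/(1:.N), by = team]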

#    team score avg_score
# 1:    A    10        10
# 2:    A     4         7
# 3:    A    10         8
# 4:    A    16        10
# 5:    A    20        12
# 6:    B     5         5
# 7:    B    11         8
# 8:    B     8         8
# 9:    B    16        10
# 10:    B     5         9
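
For readers who would rather not load a package, the same running mean can probably be sketched in base R. This is not part of the original answer: it swaps data.table's grouped `:=` for `ave()`, and it rebuilds the sample data from the question so that it runs on its own.

```r
# Sample data copied from the question.
df <- data.frame(
  team  = rep(c("A", "B"), each = 5),
  score = c(10, 4, 10, 16, 20, 5, 11, 8, 16, 5)
)

# ave() applies FUN to the scores of each team separately and returns the
# results in the original row order. Within a team, the running mean is the
# cumulative sum divided by the row position (1, 2, 3, ...).
df$avg_score <- ave(df$score, df$team,
                    FUN = function(x) cumsum(x) / seq_along(x))

print(df)
```

Unlike `setDT()`, this leaves `df` as an ordinary data frame, which may matter if later code relies on data.frame semantics.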

answered Sep 21 '22 10:09

Sathish
