I have soccer results data in the following format (thousands of observations):
   Div       date       value pts
1   E0 2011-08-13   Blackburn 0.0
2   E0 2011-08-13      Fulham 0.5
3   E0 2011-08-13   Liverpool 0.5
4   E0 2011-08-13   Newcastle 0.5
5   E0 2011-08-13         QPR 0.0
6   E0 2011-08-13       Wigan 0.5
7   E0 2011-08-14       Stoke 0.5
8   E0 2011-08-14   West Brom 0.0
9   E0 2011-08-15    Man City 1.0
10  E0 2011-08-20     Arsenal 0.0
11  E0 2011-08-20 Aston Villa 1.0
plus other variables. "value" is the team, pts is the final result (win/loss/draw) as a numerical value. I'm trying to add a new variable which is the average of this value over the last X games for the team in that row. How do I do this without using some horrible loop?
Take a look at this, which uses rollmean from the zoo package and ddply from the plyr package:
library(zoo)
library(plyr)
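# 5 teams (a-e) with 10 random results each
dat <- data.frame(value=letters[1:5], pts=sample(c(0, 0.5, 1), 50, replace=TRUE))
# rolling mean of pts over the last 5 rows, computed separately for each team
ddply(dat, .(value), summarise, rollmean(pts, k=5, align='right'))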
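However, as far as I understand it, a rolling average shortens your data set by definition: with a window of k it returns k - 1 fewer values than it is given. You can supply a fill argument to pad the result back to full length, though: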
ddply(dat, .(value), summarise, rollmean(pts, k=5, fill=NA, align='right'))
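To attach the rolling mean to the original rows instead of getting back a summary, one option is to swap summarise for transform. This is only a minimal sketch, assuming the same zoo and plyr packages; the column name avg3, the window of 3 and the toy data are made up for illustration, and the rows are assumed to be in date order within each team already:

```r
library(zoo)
library(plyr)

# Toy data (hypothetical): two teams, four games each, in date order per team
dat <- data.frame(
  value = rep(c("a", "b"), each = 4),
  pts   = c(0, 0.5, 1, 1,   1, 0, 0, 0.5)
)

# Without fill, rollmean returns k - 1 fewer values than it was given
n_short <- length(rollmean(dat$pts[dat$value == "a"], k = 3, align = "right"))

# transform keeps every original column and appends the new one;
# fill = NA pads the first k - 1 games of each team
out <- ddply(dat, .(value), transform,
             avg3 = rollmean(pts, k = 3, fill = NA, align = "right"))
out
```

Note that with align = "right" the window ends at the current row, so each average includes that row's own result.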
Try the ave function from the stats package.
Trt <- gl(n=2, k=3, length=2*3, labels =c("A", "B"))
Y <- 1:6
Data <- data.frame(Trt, Y)
Data
  Trt Y
1   A 1
2   A 2
3   A 3
4   B 4
5   B 5
6   B 6
Data$TrtMean <- ave(Y, Trt, FUN=mean)
Data
  Trt Y TrtMean
1   A 1       2
2   A 2       2
3   A 3       2
4   B 4       5
5   B 5       5
6   B 6       5
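With FUN=mean, ave gives one overall mean per group rather than a moving one. Since ave accepts any function that returns a vector as long as its input, a rolling mean could be passed instead. A rough sketch in base R only; roll_last, the window k and the toy data are hypothetical, and it assumes the rows are already sorted by date within each team:

```r
# Mean of the current and previous k - 1 values; NA until k values exist
roll_last <- function(x, k = 3) {
  n <- length(x)
  if (n < k) return(rep(NA_real_, n))
  cs <- cumsum(c(0, x))  # cs[i + 1] holds the sum of x[1..i]
  c(rep(NA_real_, k - 1), (cs[(k + 1):(n + 1)] - cs[1:(n - k + 1)]) / k)
}

# Toy data (hypothetical): two teams with their games interleaved
games <- data.frame(
  team = c("A", "B", "A", "B", "A", "B", "A", "B"),
  pts  = c(0, 1, 0.5, 0, 1, 0, 1, 0.5)
)

# ave splits pts by team, applies roll_last, and puts the results
# back in the original row order
games$avg3 <- ave(games$pts, games$team, FUN = roll_last)
games
```

Because ave returns its result in the original row order, the new column lines up with the data frame even when the teams' rows are interleaved.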