I would like to get a rolling average for each of my numeric variables. Using the data.table package, I know how to compute it for a single variable. How should I revise the code so that it processes multiple variables at once, rather than changing the variable name and repeating the procedure several times? Thanks.
Suppose I have other numeric variables named "V2", "V3", and "V4".
library(data.table)
library(zoo)  # provides rollmean()
setDT(data)
setkey(data, Receptor, date)
data[, RollConc := rollmean(AvgConc, 48, align = "left", fill = NA), by = Receptor]
A copy of my sample data can be found at: https://drive.google.com/file/d/0B86_a8ltyoL3OE9KTUstYmRRbFk/view?usp=sharing
I would like to get 5-hour rolling means for "AvgConc", "TotDep", "DryDep", and "WetDep" for each receptor.
Calculating rolling averages
To calculate a simple moving average (for example, over 7 days), we can use the rollmean() function from the zoo package. This function takes an argument k, the integer width of the rolling window. For example, setting k to 3, 5, 7, 15, and 21 gives 3-, 5-, 7-, 15-, and 21-day rolling averages of daily COVID deaths in the US.
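As a minimal sketch of the idea above, the snippet below computes one rolling mean per window width. The `deaths` vector is synthetic and stands in for a real daily series; the names `deaths`, `widths`, and `roll` are chosen here for illustration only.

```r
library(zoo)

# Synthetic daily counts, for illustration only (not real COVID data)
set.seed(1)
deaths <- rpois(30, lambda = 20)

# Window widths taken from the text above
widths <- c(3, 5, 7, 15, 21)

# One centered rolling mean per width; fill = NA keeps each result
# the same length as the input
roll <- sapply(widths, function(k) rollmean(deaths, k = k, fill = NA))
colnames(roll) <- paste0("roll_", widths)
head(roll, 10)
```

Each column of `roll` holds the rolling mean for one width, with NA wherever the window does not fit entirely inside the series.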
The 7-day moving average for any given day is the average of that day's sales and the sales of the 6 previous days. The 14-day moving average is likewise the average over the day in question and the previous 13 days. In R we can calculate this with the rollmean function from the zoo package, using align = "right" so that each window ends on the current day.
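A small sketch of such a trailing window, assuming a made-up `sales` vector (the figures are hypothetical and only there to make the example runnable):

```r
library(zoo)

# Hypothetical daily sales figures, for illustration only
sales <- c(10, 12, 11, 15, 14, 13, 18, 20, 17, 16)

# align = "right" places each mean at the last day of its window,
# so position i averages days (i - 6) through i
trailing7 <- rollmean(sales, k = 7, fill = NA, align = "right")

# The first complete window ends on day 7, so these two values agree
trailing7[7]
mean(sales[1:7])
```

The first six positions are NA because fewer than seven days are available there.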
A rolling average recomputes the average as each new data point arrives, using only the most recent points that fall inside a fixed-size window. For example, with a three-month window, the rolling average of return quantities at March 2012 would be calculated by adding the return quantities in January, February, and March, and then dividing that sum by three.
From your description you want something like this, which is similar to one example that can be found in one of the data.table vignettes:
library(data.table)
set.seed(42)
DT <- data.table(x = rnorm(10), y = rlnorm(10), z = runif(10), g = c("a", "b"), key = "g")
library(zoo)
# fill = NA replaces the deprecated na.pad = TRUE
DT[, paste0("ravg_", c("x", "y")) := lapply(.SD, rollmean, k = 3, fill = NA),
   by = g, .SDcols = c("x", "y")]
Now, one can use the frollmean function in the data.table package for this.
library(data.table)
xy <- c("x", "y")
DT[, (xy) := lapply(.SD, frollmean, n = 3, fill = NA, align = "center"),
   by = g, .SDcols = xy]
Here, I am replacing the x and y columns by the rolling average.
# Data
set.seed(42)
DT <- data.table(x = rnorm(10), y = rlnorm(10), z = runif(10),
g = c("a", "b"), key = "g")
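Applied to the columns named in the question, the same pattern might look like the sketch below. The data here is a synthetic stand-in, since the linked file is not reproduced: the column names come from the question, while the values, the hourly timestamps, and the "Roll" prefix for the new columns are assumptions made for illustration.

```r
library(data.table)

# Synthetic stand-in for the asker's data: column names come from the
# question, but the values and the hourly timestamps are made up
set.seed(42)
data <- data.table(
  Receptor = rep(1:2, each = 10),
  date     = rep(seq(as.POSIXct("2020-01-01 00:00", tz = "UTC"),
                     by = "hour", length.out = 10), times = 2),
  AvgConc  = runif(20), TotDep = runif(20),
  DryDep   = runif(20), WetDep = runif(20)
)
setkey(data, Receptor, date)

cols <- c("AvgConc", "TotDep", "DryDep", "WetDep")

# frollmean() accepts a list of columns, so .SD can be passed directly;
# new "Roll"-prefixed columns keep the originals intact
data[, paste0("Roll", cols) := frollmean(.SD, n = 5, align = "left"),
     by = Receptor, .SDcols = cols]
```

Writing to new column names rather than to `cols` itself leaves the original measurements available for comparison. With align = "left", the last four rows of each receptor are NA because a full 5-row window no longer fits.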