Rolling average for multiple variables in R using the data.table package

I would like to compute a rolling average for each of my numeric variables. Using the data.table package, I know how to do this for a single variable. How should I revise the code so that it processes several variables at once, rather than editing the variable name and repeating the procedure several times? Thanks.

Suppose I have other numeric variables named "V2", "V3", and "V4".

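library(data.table)
library(zoo)  # provides rollmean()
setDT(data)
setkey(data, Receptor, date)
# left-aligned rolling mean of AvgConc over a 48-row window, per receptor
data[, RollConc := rollmean(AvgConc, 48, align = "left", fill = NA), by = Receptor]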

A copy of my sample data can be found at: https://drive.google.com/file/d/0B86_a8ltyoL3OE9KTUstYmRRbFk/view?usp=sharing

I would like to get 5-hour rolling means for "AvgConc", "TotDep", "DryDep", and "WetDep" for each receptor.

Vicki1227 asked Jul 17 '15 18:07


2 Answers

From your description you want something like this, which is similar to one example that can be found in one of the data.table vignettes:

library(data.table)
set.seed(42)
DT <- data.table(x = rnorm(10), y = rlnorm(10), z = runif(10), g = c("a", "b"), key = "g")
library(zoo)
# fill = NA pads the ends of each group (na.pad is deprecated in zoo)
DT[, paste0("ravg_", c("x", "y")) := lapply(.SD, rollmean, k = 3, fill = NA),
   by = g, .SDcols = c("x", "y")]
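
The same pattern could be adapted to the column names in the question. This is only a sketch: the `toy` table and its values are made up for illustration, and it assumes one row per hour with no gaps, so that a 5-row window corresponds to 5 hours. The `Roll` prefix for the new columns is also just an illustrative choice.

```r
library(data.table)
library(zoo)

# Toy stand-in for the question's data (values are made up for illustration)
set.seed(1)
toy <- data.table(
  Receptor = rep(c("R1", "R2"), each = 8),
  date     = rep(1:8, times = 2),
  AvgConc  = runif(16),
  TotDep   = runif(16),
  DryDep   = runif(16),
  WetDep   = runif(16)
)
setkey(toy, Receptor, date)

vars <- c("AvgConc", "TotDep", "DryDep", "WetDep")

# One left-aligned 5-row rolling mean per variable, computed within each receptor
toy[, paste0("Roll", vars) := lapply(.SD, rollmean, k = 5, fill = NA, align = "left"),
    by = Receptor, .SDcols = vars]
```

With `align = "left"`, the last four rows of each receptor have no complete window and are filled with `NA`.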
Roland answered Oct 15 '22 18:10


Now, one can use the frollmean function in the data.table package for this.

library(data.table)
xy <- c("x", "y")
DT[, (xy) := lapply(.SD, frollmean, n = 3, fill = NA, align = "center"),
   by = g, .SDcols = xy]

Here, I am replacing the x and y columns with their rolling averages.


# Data
set.seed(42)
DT <- data.table(x = rnorm(10), y = rlnorm(10), z = runif(10),
                 g = c("a", "b"), key = "g")
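
If the original columns should be kept, one possible variant is to assign the results to new columns. Since `frollmean` also accepts a list of columns, `.SD` can be passed to it directly without `lapply`. This is a sketch, not part of the answer above: `DT2` and the `ravg_` prefix are names chosen here for illustration.

```r
library(data.table)

# Same toy data as above, under a separate (hypothetical) name
set.seed(42)
DT2 <- data.table(x = rnorm(10), y = rlnorm(10), z = runif(10),
                  g = c("a", "b"), key = "g")
xy <- c("x", "y")

# frollmean() accepts a list of columns (.SD here) and returns a list of the
# same length, so the result can be assigned to new columns in one step
DT2[, paste0("ravg_", xy) := frollmean(.SD, n = 3, fill = NA, align = "center"),
    by = g, .SDcols = xy]
```

The `x` and `y` columns are left unchanged, and the first and last row of each group are `NA` because a centered window of 3 is incomplete there.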
Suren answered Oct 15 '22 19:10