I am trying to calculate a rolling mean of a value, grouped by multiple dimensions, in R. In SQL I would do it like this:
AVG(value) OVER
(PARTITION BY dim1, dim2 ORDER BY date
RANGE BETWEEN 5 PRECEDING AND CURRENT ROW)
The following seems to work if I select just a few dimensions:
library(zoo)  # provides rollapply
s <- ave(df$value,
list(df$dim1, df$dim2),
FUN = function(x) rollapply(x, 5, mean, align = 'right'))
but gives the following error when I select the full set of dimensions:
Error: k <= n is not TRUE
I get the same error when I run:
rollapply(c(1:2), 3, mean, align='right')
so I guess the issue is that some combinations of dimensions do not have enough values to calculate the mean.
How can I work around this? I am fine with getting NA as the result for those combinations. Any help would be much appreciated.
roll_meanr from the RcppRoll package will do this by default:
library(RcppRoll)
roll_meanr(c(1:2), 3)
# [1] NA NA
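To tie this back to the grouped case in the question, here is a minimal sketch of how roll_meanr could be plugged into ave. The data frame below is made up for illustration (the column names follow the question), and it assumes the rows are already sorted by date within each group. Wrapping the dimensions in interaction(..., drop = TRUE) is my own addition, not something the question used: it keeps ave from calling the function on combinations that never occur in the data.

```r
library(RcppRoll)

# Hypothetical data: group (a, x) has 6 rows, group (b, y) has only 2,
# i.e. fewer rows than the window width of 5.
df <- data.frame(
  dim1  = c(rep("a", 6), rep("b", 2)),
  dim2  = c(rep("x", 6), rep("y", 2)),
  value = c(1, 2, 3, 4, 5, 6, 10, 20)
)

# One grouping factor per observed (dim1, dim2) combination;
# drop = TRUE discards combinations with no rows.
grp <- interaction(df$dim1, df$dim2, drop = TRUE)

# Right-aligned rolling mean over 5 rows within each group.
# Positions without a full window come back as NA.
df$roll <- ave(df$value, grp,
               FUN = function(x) roll_meanr(x, n = 5))

print(df$roll)
```

With this data, the short group gets NA for every row and the longer group gets NA until a full window of 5 values is available, so no "k <= n" error is raised.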