
How do I take a rolling product using data.table

Tags: r, data.table

dt <- data.table(x=c(1, .9, .8, .75, .5, .1))
dt
      x
1: 1.00
2: 0.90
3: 0.80
4: 0.75
5: 0.50
6: 0.10

For each row, how do I get the product of x for that row and the next two rows?

      x Prod.3
1: 1.00 0.7200
2: 0.90 0.5400
3: 0.80 0.3000
4: 0.75 0.0375
5: 0.50     NA
6: 0.10     NA

More generally, for each row, how do I get the product of x for that row and the next n rows?

Ben asked Jun 02 '15 18:06



2 Answers

Here's another possible version using data.table::shift combined with Reduce (as per @Arun's comment):

library(data.table) #v1.9.6+
N <- 3L
dt[, Prod3 := Reduce(`*`, shift(x, 0L:(N - 1L), type = "lead"))]

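shift is vectorized over its n argument: given a vector of offsets, it returns a list with one shifted copy of x per offset. Reduce then applies * across those vectors element-wise, so each row gets the product of its own value and the next N - 1 values. Rows without enough following values come out as NA.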
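To see the two steps separately, here is a minimal sketch on the question's data. The names leads and prod3 are introduced here only for illustration and are not part of the original answer.

```r
library(data.table)

dt <- data.table(x = c(1, .9, .8, .75, .5, .1))
N <- 3L

# shift() with a vector of offsets returns a list with one lead vector
# per offset: offset 0 is x itself, offset 1 is x moved up one row, etc.
leads <- shift(dt$x, 0L:(N - 1L), type = "lead")
length(leads)  # one vector per offset, N in total

# Reduce() multiplies the vectors element by element. An NA in any lead
# vector propagates, so the last N - 1 rows of the result are NA.
prod3 <- Reduce(`*`, leads)
prod3
```

The result should line up with the Prod.3 column in the question, up to floating-point tolerance.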

David Arenburg answered Sep 23 '22 02:09



data.table now has fast rolling functions, so @Mamoun Benghezal's approach can be written as:

dt[, Prod.3 := frollapply(x, 3, FUN=prod, fill=NA, align='left')]
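For the general case in the question, the window size can be passed as a variable. The sketch below assumes N counts the current row plus the rows that follow it, so "this row and the next n rows" corresponds to N = n + 1. The column name Prod.N and the vector expected are illustrative only.

```r
library(data.table)

dt <- data.table(x = c(1, .9, .8, .75, .5, .1))
N <- 3L  # window size: the current row plus the next N - 1 rows

# align = "left" anchors each window at the current row and looks forward;
# windows that run past the end of the column are filled with NA.
dt[, Prod.N := frollapply(x, N, FUN = prod, fill = NA, align = "left")]

# Compare against the values requested in the question.
expected <- c(0.72, 0.54, 0.3, 0.0375, NA, NA)
all.equal(dt$Prod.N, expected)
```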

This is very fast, though not as fast as @David Arenburg's function. Using @Arun's benchmark:

set.seed(1L)
dt = data.table(x=runif(1e6))

froll_fun <- function(dt, N) {
    frollapply(dt$x, N, FUN = prod, fill = NA, align = 'left')
}

system.time(ans5 <- froll_fun(dt, 3L))
#  user  system elapsed 
# 0.191   0.000   0.191 
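To reproduce the comparison on your own machine, something like the following could be used. The wrapper name shift_fun is hypothetical (it simply wraps the shift/Reduce approach), and no timings are shown because they depend on hardware and on the data.table version.

```r
library(data.table)

set.seed(1L)
dt <- data.table(x = runif(1e6))

# Hypothetical wrapper around the shift()/Reduce() approach.
shift_fun <- function(dt, N) {
    Reduce(`*`, shift(dt$x, 0L:(N - 1L), type = "lead"))
}

# Same wrapper as in the benchmark above.
froll_fun <- function(dt, N) {
    frollapply(dt$x, N, FUN = prod, fill = NA, align = "left")
}

system.time(ans_shift <- shift_fun(dt, 3L))
system.time(ans_froll <- froll_fun(dt, 3L))

# Both approaches should agree up to floating-point tolerance.
all.equal(ans_shift, ans_froll)
```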
James Hirschorn answered Sep 24 '22 02:09
