I'm trying to speed up/vectorize some calculations in a time series. Can I vectorize a calculation in a for loop which can depend on results from an earlier iteration? For example:
    z <- c(1,1,0,0,0,0)
    zi <- 2:6
    for (i in zi) {
      z[i] <- ifelse(z[i-1] == 1, 1, 0)
    }
uses the z[i] values updated in earlier steps:
    > z
    [1] 1 1 1 1 1 1
In my effort at vectorizing this
    z <- c(1,1,0,0,0,0)
    z[zi] <- ifelse(z[zi-1] == 1, 1, 0)
the element-by-element operations don't use results updated in the operation:
    > z
    [1] 1 1 1 0 0 0
So this vectorized operation operates in 'parallel' rather than iterative fashion. Is there a way I can write/vectorize this to get the results of the for loop?
ifelse() is vectorized, and there's a bit of a penalty if you're using it on one element at a time in a for loop. In your example, you can get a pretty good speedup by using a plain if ... else instead of ifelse().
    fun1 <- function(z) {
      for(i in 2:NROW(z)) {
        z[i] <- ifelse(z[i-1]==1, 1, 0)
      }
      z
    }
    fun2 <- function(z) {
      for(i in 2:NROW(z)) {
        z[i] <- if(z[i-1]==1) 1 else 0
      }
      z
    }
    z <- c(1,1,0,0,0,0)
    identical(fun1(z),fun2(z))
    # [1] TRUE
    system.time(replicate(10000, fun1(z)))
    #    user  system elapsed
    #    1.13    0.00    1.32
    system.time(replicate(10000, fun2(z)))
    #    user  system elapsed
    #    0.27    0.00    0.26
You can get some additional speed gains out of fun2 by compiling it.
    library(compiler)
    cfun2 <- cmpfun(fun2)
    system.time(replicate(10000, cfun2(z)))
    #    user  system elapsed
    #    0.11    0.00    0.11
So there's a 10x speedup without vectorization. As others have said (and some have illustrated) there are ways to vectorize your example, but that may not translate to your actual problem. Hopefully this is general enough to be applicable.
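For what it's worth, here is a rough sketch of two loop-free ways to write the toy example. Neither comes from the question: carry_forward is a hypothetical helper name, and the closed form only works because every later element in this particular recurrence ends up depending on z[1] alone. Reduce() still walks the vector one element at a time internally, so treat it as a tidier way to express the dependency rather than a guaranteed speedup.

```r
z <- c(1, 1, 0, 0, 0, 0)

# Reference: the sequential loop from the question
loop_z <- z
for (i in 2:length(loop_z)) {
  loop_z[i] <- if (loop_z[i - 1] == 1) 1 else 0
}

# Option 1: Reduce() with accumulate = TRUE passes each *updated* value
# to the next step. carry_forward is a hypothetical helper; `cur` (the
# original element) is unused because the recurrence ignores it.
carry_forward <- function(prev, cur) if (prev == 1) 1 else 0
reduce_z <- Reduce(carry_forward, z, accumulate = TRUE)

# Option 2: closed form, valid only for this specific recurrence,
# where every element after the first equals the indicator z[1] == 1
closed_z <- z
closed_z[-1] <- as.numeric(z[1] == 1)

identical(loop_z, reduce_z)
identical(loop_z, closed_z)
```

Whether either idea carries over depends on whether your real recurrence can be written as a function of the previous result (Option 1) or collapsed algebraically (Option 2).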
The filter function may be useful to you as well, if you can figure out how to express your problem in terms of an autoregressive or moving average process.
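As a minimal sketch of what that could look like, assume the recurrence has the linear form y[i] = x[i] + a * y[i-1]. The values of x and a below are made up for illustration. stats::filter with method = "recursive" computes exactly this kind of recurrence in compiled code, and init supplies the value just before the start of the series.

```r
# Illustrative inputs (assumptions, not from the question)
x <- c(1, 2, 3, 4)
a <- 0.5

# Explicit loop: y[i] = x[i] + a * y[i-1], starting from y[0] = 0
y_loop <- numeric(length(x))
prev <- 0
for (i in seq_along(x)) {
  y_loop[i] <- x[i] + a * prev
  prev <- y_loop[i]
}

# Same recurrence as a recursive filter; filter() returns a ts object,
# so convert it back to a plain numeric vector
y_filt <- as.numeric(stats::filter(x, filter = a, method = "recursive"))

all.equal(y_loop, y_filt)

# The toy example from the question fits this form only because z is 0/1:
# z[i] = 0 + 1 * z[i-1], seeded with z[1] through init
z <- c(1, 1, 0, 0, 0, 0)
z_filt <- c(z[1], as.numeric(stats::filter(rep(0, length(z) - 1), filter = 1,
                                           method = "recursive", init = z[1])))
```

A recurrence with a nonlinear update, such as a general if/else on the previous value, won't fit this form, so this only helps when the dependency is a weighted sum of earlier results.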