Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Avoid two for loops in R

Tags:

loops

r

I have a R code that can do convolution of two functions...

convolveSlow <- function(x, y) {  
nx <- length(x); ny <- length(y)  
xy <- numeric(nx + ny - 1)  
for(i in seq(length = nx)) {  
 xi <- x[[i]]  
        for(j in seq(length = ny)) {  
            ij <- i+j-1  
            xy[[ij]] <- xy[[ij]] + xi * y[[j]]  
        }  
    }  
    xy  
}  

Is there a way to remove the two for loops and make the code run faster?

Thank you San

like image 353
user602599 Avatar asked Feb 04 '11 04:02

user602599


People also ask

Should you avoid for loops in R?

For that reason, it might make sense for you to avoid for-loops and to use functions such as lapply instead. This might speed up the R syntax and can save a lot of computational power! The next example explains how to use the lapply function in R.

How do I remove a loop in R?

Just like with repeat and while loops, you can break out of a for loop completely by using the break statement. Additionally, if you just want to skip the current iteration, and continue the loop, you can use the next statement.


2 Answers

Since R is very fast at computing vector operations, the most important thing to keep in mind when programming for performance is to vectorise as many of your operations as possible.

This means thinking hard about replacing loops with vector operations. Here is my solution for fast convolution (50 times faster with input vectors of length 1000 each):

convolveFast <- function(x, y) {
    nx <- length(x)
    ny <- length(y)
    xy <- nx + ny - 1
    xy <- rep(0, xy)
    for(i in (1:nx)){
        j <- 1:ny
        ij <- i + j - 1
        xy[i+(1:ny)-1] <- xy[ij] + x[i] * y
    }
    xy
}

You will notice that the inner loop (for j in ...) has disappeared. Instead, I replaced it with a vector operation. j is now defined as a vector (j <- 1:ny). Notice also that I refer to the entire vector y, rather than subsetting it (i.e. y instead of y[j]).

j <- 1:ny
ij <- i + j - 1
xy[i+(1:ny)-1] <- xy[ij] + x[i] * y

I wrote a small function to measure peformance:

measure.time <- function(fun1, fun2, ...){
    ptm <- proc.time()
    x1 <- fun1(...)
    time1 <- proc.time() - ptm

    ptm <- proc.time()
    x2 <- fun2(...)
    time2 <- proc.time() - ptm

    ident <- all(x1==x2)

    cat("Function 1\n")
    cat(time1)
    cat("\n\nFunction 2\n")
    cat(time2)
    if(ident) cat("\n\nFunctions return identical results")

}

For two vectors of length 1000 each, I get a 98% performance improvement:

x <- runif(1000)
y <- runif(1000)
measure.time(convolveSlow, convolveFast, x, y)

Function 1
7.07 0 7.59 NA NA

Function 2
0.14 0 0.16 NA NA

Functions return identical results
like image 88
Andrie Avatar answered Nov 11 '22 09:11

Andrie


  1. For vectors, you index with [], not [[]], so use xy[ij] etc

  2. Convolution doesn't vectorise easily but one common trick is to switch to compiled code. The Writing R Extensions manual uses convolution as a running example and shows several alternative; we also use it a lot in the Rcpp documentation.

like image 30
Dirk Eddelbuettel Avatar answered Nov 11 '22 08:11

Dirk Eddelbuettel