Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to vectorize triple nested loops?

Tags:

r

nested-loops

I've done searching similar problems and I have a vague idea about what should I do: to vectorize everything or use apply() family. But I'm a beginner on R programming and both of the above methods are quite confusing.

Here is my source code:

x<-rlnorm(100,0,1.6)
j=0
k=0
i=0
h=0
lambda<-rep(0,200)
sum1<-rep(0,200)
constjk=0
wj=0
wk=0
for (h in 1:200)
{
   lambda[h]=2+h/12.5
   N=ceiling(lambda[h]*max(x))
   for (j in 0:N)
   {
      wj=(sum(x<=(j+1)/lambda[h])-sum(x<=j/lambda[h]))/100
      for (k in 0:N)
      {
         constjk=dbinom(k, j + k, 0.5)
         wk=(sum(x<=(k+1)/lambda[h])-sum(x<=k/lambda[h]))/100
         sum1[h]=sum1[h]+(lambda[h]/2)*constjk*wk*wj
      }
   }
}

Let me explain a bit. I want to collect 200 sum1 values (that's the first loop), and for every sum1 value, it is the summation of (lambda[h]/2)*constjk*wk*wj, thus the other two loops. Most tedious is that N changes with h, so I have no idea how to vectorize the j-loop and the k-loop. But of course I can vectorize the h-loop with lambda<-seq() and N<-ceiling(), and that's the best I can do. Is there a way to further simplify the code?

like image 764
Nicholas Humphrey Avatar asked Nov 09 '12 04:11

Nicholas Humphrey


1 Answers

Your code can be perfectly verctorized with 3 nested sapply calls. It might be a bit hard to read for the untrained eye, but the essence of it is that instead of adding one value at a time to sum1[h] we calculate all the terms produced by the innermost loop in one go and sum them up.

Although this vectorized solution is faster than your tripple for loop, the improvement is not dramatical. If you plan to use it many times I suggest you implement it in C or Fortran (with regular for loops), which improves the speed a lot. Beware though that it has high time complexity and will scale badly with increased values of lambda, ultimatelly reaching a point when it is not possible to compute within reasonable time regardless of the implementation.

lambda <- 2 + 1:200/12.5
sum1 <- sapply(lambda, function(l){
    N <- ceiling(l*max(x))
    sum(sapply(0:N, function(j){
        wj <- (sum(x <= (j+1)/l) - sum(x <= j/l))/100
        sum(sapply(0:N, function(k){
            constjk <- dbinom(k, j + k, 0.5)
            wk <- (sum(x <= (k+1)/l) - sum(x <= k/l))/100
            l/2*constjk*wk*wj
        }))
    }))
})

Btw, you don't need to predefine variables like h, j, k, wj and wk. Especially since not when vectorizing, as assignments to them inside the functions fed to sapply will create overlayered local variables with the same name (i.e. ignoring the ones you predefied).

like image 190
Backlin Avatar answered Oct 05 '22 22:10

Backlin