 

Is there a more efficient method than while loops for something that requires conditional checking?

Tags: loops, r

I have a problem that involves wrapping a while loop around a bit of code that I believe can be vectorized efficiently. However, at each step, my stopping condition depends on the value drawn at that step. Consider this example as a representative model of my problem:
Generate N(0,1) random variables using rnorm() until you sample a value greater than an arbitrary value, k.

EDIT: A caveat of my problem, discussed in the comments, is that I cannot know, a priori, a good approximation of how many samples to take before my stopping condition.

One approach:

  1. Using a while-loop, sample suitably sized normal random vectors (for instance, rnorm(50) to sample 50 standard normals at a time, or rnorm(1) if k is close to zero). Check this vector to see if any observations are greater than k.

  2. If yes, stop and return all preceding values. Otherwise, combine your vector from step 1 with a new vector you make by repeating step 1.
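
The two steps above can be sketched roughly as follows. This is only an illustration: the function name `sample_fixed_chunks` and its `chunk.size` argument are hypothetical, and it assumes the value that first exceeds k is kept as the last element of the result.

```r
# Hypothetical helper: draw fixed-size chunks of N(0,1) values until
# one of them exceeds k, then return everything drawn up to that value.
sample_fixed_chunks <- function(k, chunk.size = 50) {
  chunks <- list()
  found  <- FALSE
  while (!found) {
    x   <- rnorm(chunk.size)
    hit <- match(TRUE, x > k)        # index of the first value above k, or NA
    if (!is.na(hit)) {
      x     <- x[seq_len(hit)]       # drop draws made after the stopping value
      found <- TRUE
    }
    chunks[[length(chunks) + 1L]] <- x
  }
  unlist(chunks)
}

set.seed(1)
draws <- sample_fixed_chunks(k = 2)
```

Collecting the chunks in a list and calling `unlist()` once at the end avoids repeatedly growing a vector with `c()`, which copies all earlier draws on every iteration.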

Another approach would be to specify a completely overkill number of random draws for that given k. This might mean if k=2, sample 1,000 normal random variables using rnorm(1000).

Leveraging the vectorization that R offers in the second case gives faster results than the loop version in cases where the overkill number is not too much larger than necessary, but in my problem, I don't have a good intuition for how many runs I need to do, so I'd need to be conservative.
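
For comparison, a minimal sketch of the "overkill" approach is shown here. The name `sample_overkill` and the `n.max` argument are hypothetical, and returning `NULL` when no draw exceeds k is just one possible way to signal that the guess was too small.

```r
# Hypothetical helper: draw n.max values in one call and cut the vector
# at the first value above k.
sample_overkill <- function(k, n.max = 1000) {
  x <- rnorm(n.max)
  first.above <- match(TRUE, x > k)  # NA if no draw exceeds k
  if (is.na(first.above)) {
    return(NULL)                     # n.max was too small for this k
  }
  x[seq_len(first.above)]
}

# Probability that 1000 standard normal draws contain no value above 2,
# i.e. that the overkill guess fails for k = 2.
p.miss <- pnorm(2)^1000
```

The drawback described above shows up in `n.max`: without a reasonable guess for it, either the call fails to reach k or most of the generated values are thrown away.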

The question follows: Is there a way to do a highly vectorized procedure, like the second approach, but with conditional checking like the first approach? Is doing small vectorized operations like rnorm(50) the "fastest" way, considering that the highly vectorized method is faster per element but more wasteful?

Christopher Aden asked Apr 20 '12 18:04




1 Answer

Here is an implementation of my earlier suggestion: use your first approach, but grow the number of new samples at each iteration. For example, instead of drawing 50 new samples every time, double the batch size between iterations: 50, then 100, 200, 400, etc.

Because the batch sizes follow a divergent geometric series, the cumulative sample count grows exponentially, so the number of iterations grows only logarithmically with the number of samples you end up needing. You are guaranteed to exit in a "few" iterations.

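# FUN:           sampler taking a sample size, e.g. rnorm
# exit.thresh:   stop once a sampled value exceeds this threshold
# sample.start:  number of samples drawn in the first iteration
# sample.growth: factor applied to the batch size after each iteration
sample.until.thresh <- function(FUN, exit.thresh,
                                sample.start = 50,
                                sample.growth = 2) {

   sample.size    <- sample.start
   all.values     <- list()
   num.iterations <- 0L

   repeat {
      num.iterations <- num.iterations + 1L
      sample.values  <- FUN(sample.size)
      all.values[[num.iterations]] <- sample.values

      above.thresh <- sample.values > exit.thresh
      if (any(above.thresh)) {
         # keep this batch only up to and including the first exceedance
         first.above <- match(TRUE, above.thresh)
         all.values[[num.iterations]] <- sample.values[seq_len(first.above)]
         break
      }

      # no exceedance yet: draw a larger batch next time
      sample.size <- sample.size * sample.growth
   }

   # flatten the list of batches once, at the end
   all.values <- unlist(all.values)

   return(list(num.iterations = num.iterations,
               sample.size    = length(all.values),
               sample.values  = all.values))
}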

set.seed(123456L)
res <- sample.until.thresh(rnorm, 5)
res$num.iterations
# [1] 16
res$sample.size
# [1] 2747703
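
As a rough check on the figures reported above, the arithmetic of the geometric growth can be worked out directly. The helper names `draws.after` and `iterations.needed` are hypothetical and only illustrate the closed-form sum start * (growth^m - 1) / (growth - 1); they are not part of the function above.

```r
# Total number of draws generated after m iterations of geometric growth.
draws.after <- function(m, start = 50, growth = 2) {
  start * (growth^m - 1) / (growth - 1)
}

# Smallest number of iterations whose cumulative draws reach n.
iterations.needed <- function(n, start = 50, growth = 2) {
  ceiling(log(n * (growth - 1) / start + 1, base = growth))
}

iterations.needed(2747703)  # 16, consistent with res$num.iterations
draws.after(15)             # 1638350: fewer than the 2747703 draws used
draws.after(16)             # 3276750: upper bound on draws generated

# Expected number of N(0,1) draws needed before one exceeds 5.
expected.draws <- 1 / pnorm(5, lower.tail = FALSE)
```

With a growth factor of 2, the last batch is about as large as all earlier batches combined, so the number of values generated stays within roughly a factor of two of the number actually kept.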
flodel answered Oct 05 '22 01:10