I have noticed a curious thing whilst working in R. When a simple program that computes the squares from 1 to N is implemented using a for-loop and using a while-loop, the behaviour is not the same. (I don't care about vectorisation or the apply functions in this case.)
fn1 <- function (N) {
  for(i in 1:N) {
    y <- i*i
  }
}

and

fn2 <- function (N) {
  i=1
  while(i <= N) {
    y <- i*i
    i <- i + 1
  }
}
The results are:
system.time(fn1(60000))
   user  system elapsed
  2.500   0.012   2.493
There were 50 or more warnings (use warnings() to see the first 50)
Warning messages:
1: In i * i : NAs produced by integer overflow
.
.
.

system.time(fn2(60000))
   user  system elapsed
  0.138   0.000   0.137
Now, we know that the for-loop is supposed to be faster; my guess is that this is because of pre-allocation and optimisations there. But why does it overflow?
UPDATE: So now trying another way with vectors:
fn3 <- function (N) {
  i <- 1:N
  y <- i*i
}

system.time(fn3(60000))
   user  system elapsed
  0.008   0.000   0.009
Warning message:
In i * i : NAs produced by integer overflow
So perhaps it's a funky memory issue? I am running on OS X with 4 GB of memory and all default settings in R. This happens in both the 32- and 64-bit versions (except that times are faster).
Alex
for loops are fast. What you do inside the loop is slow (in comparison to vectorized operations). I would expect a while loop to be slower than a for loop since it needs to test a condition before each iteration. Keep in mind that R is an interpreted language, i.e., there are no compiler optimizations.
Use a for loop when you know the loop should execute n times. Use a while loop for reading a file into a variable. Use a while loop when asking for user input. Use a while loop when the increment value is nonstandard.
In general, a while loop is used if you want an action to repeat until a certain condition is met, much like a repeated if statement. A for loop is used when you want to iterate through an object.
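The distinction above can be sketched with two toy loops. The data and variable names here are made up purely for illustration:

```r
# for: iterate over the elements of an object
total <- 0
for (x in c(2, 4, 6)) {
  total <- total + x
}

# while: repeat until a condition stops holding
n <- 100
steps <- 0
while (n > 1) {
  n <- n / 2          # the number of iterations is not known up front
  steps <- steps + 1
}

print(total)  # 12
print(steps)  # 7
```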
Loops are used in programming to repeat a specific block of code. In R, while loops are used to loop until a specific condition is met.
Because 1 is numeric, but not integer (i.e. it's a floating point number), and 1:60000 is numeric and integer.
> print(class(1))
[1] "numeric"
> print(class(1:60000))
[1] "integer"
60000 squared is 3.6 billion, which is NOT representable in a signed 32-bit integer (the maximum is 2,147,483,647), hence you get NA and an overflow warning:
> as.integer(60000)*as.integer(60000)
[1] NA
Warning message:
In as.integer(60000) * as.integer(60000) : NAs produced by integer overflow
3.6 billion is easily representable in floating point, however:
> as.single(60000)*as.single(60000)
[1] 3.6e+09
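To see where the limit sits, here is a small sketch that checks R's largest integer and the first square that no longer fits. The variable names are mine, not from the question:

```r
# Largest value R's 32-bit integer type can hold
int_max <- .Machine$integer.max            # 2147483647

# 46340L squared still fits; 46341L squared does not
ok  <- 46340L * 46340L                     # 2147395600
bad <- suppressWarnings(46341L * 46341L)   # NA (integer overflow)

# The same product in double precision is exact
dbl <- 46341 * 46341                       # 2147488281

print(c(int_max = int_max, ok = ok, bad = bad))
print(dbl)
```

So any loop over 1:N with N above 46340 will start producing NA for i*i.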
To fix your for code, convert to a floating point representation:

function (N) {
  for(i in as.single(1:N)) {
    y <- i*i
  }
}
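as.single() works here, though in R it yields an ordinary double with an extra attribute, so as.numeric() is the more common spelling. A minimal sketch of the same fix, assuming you want the function to return the last square so the result can be checked (the name fn1_fixed and the return value are my additions):

```r
fn1_fixed <- function(N) {
  y <- NA_real_
  for (i in seq_len(N)) {
    x <- as.numeric(i)  # promote the integer loop variable to double
    y <- x * x          # double arithmetic, so no integer overflow
  }
  y                     # return the last square computed
}

print(fn1_fixed(60000))
```

Converting inside the loop keeps the sequence itself as integers; converting the whole sequence up front, as in the answer's version, works just as well.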
The variable in the for loop is an integer sequence, and so eventually you do this:
> y=as.integer(60000)*as.integer(60000)
Warning message:
In as.integer(60000) * as.integer(60000) : NAs produced by integer overflow
whereas in the while loop you are creating a floating point number.
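You can check this directly by recording the storage type of the counter in each kind of loop. This is only a sketch; for_type and while_type are names I made up:

```r
# for loop over 1:3: the loop variable takes its type from the sequence
for (i in 1:3) {
  for_type <- typeof(i)
}

# while loop: the counter starts from the literal 1, which is a double
j <- 1
while (j <= 3) {
  while_type <- typeof(j)
  j <- j + 1
}

print(for_type)    # "integer"
print(while_type)  # "double"
```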
It's also the reason these things are different:
> seq(0,2,1)
[1] 0 1 2
> seq(0,2)
[1] 0 1 2
Don't believe me?
> identical(seq(0,2),seq(0,2,1))
[1] FALSE
because:
> is.integer(seq(0,2))
[1] TRUE
> is.integer(seq(0,2,1))
[1] FALSE
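The values themselves are equal; only the storage type differs. A short sketch (a and b are my own names) showing that an element-wise comparison passes while identical() does not, and that converting the type makes the two match:

```r
a <- seq(0, 2)      # integer sequence
b <- seq(0, 2, 1)   # double sequence

print(typeof(a))                    # "integer"
print(typeof(b))                    # "double"
print(all(a == b))                  # TRUE: same values
print(identical(a, b))              # FALSE: different storage types
print(identical(as.numeric(a), b))  # TRUE once the types agree
```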