Suppose I want to perform a simulation using the following function:
fn1 <- function(N) {
  res <- c()
  for (i in 1:N) {
    x <- rnorm(2)
    res <- c(res, x[2] - x[1])
  }
  res
}
For very large N, computation appears to hang. Are there better ways of doing this?
(Inspired by: https://stat.ethz.ch/pipermail/r-help/2008-February/155591.html)
Loops are slower in R than in C++ because R is an interpreted language rather than a compiled one. R (>= 3.4) now has just-in-time (JIT) compilation, which makes R loops faster, but still not as fast as compiled code. That said, R loops are not that bad if you don't use too many iterations (say, no more than 100,000).
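Part of the slowdown in fn1 comes from res <- c(res, ...), which copies the whole result vector on every iteration, rather than from the loop itself. The following is a minimal sketch of the same loop with the result allocated once up front; the name fn1_prealloc is hypothetical and not part of the original post.

```r
# Same simulation as fn1, but the result vector is allocated once.
fn1_prealloc <- function(N) {
  res <- numeric(N)            # preallocate instead of growing with c()
  for (i in seq_len(N)) {      # seq_len() also behaves sensibly for N = 0
    x <- rnorm(2)              # draw two standard normal values
    res[i] <- x[2] - x[1]      # store their difference in slot i
  }
  res
}

out <- fn1_prealloc(1000)
```

Writing into res[i] avoids the repeated copying, so the run time should grow roughly in proportion to N rather than much faster.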
A for loop is the most intuitive way to apply an operation to a series: it steps through the items one by one, which makes perfect sense logically. In R, however, it is often less efficient than the alternatives, so useRs tend to avoid it where they can.
The R break statement is useful for exiting from any loop: for, while, or repeat. When R encounters break inside one of these loops, it stops executing the loop body and immediately exits the loop.
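A small sketch of break in a for loop, using made-up values purely for illustration:

```r
# Sum the integers 1, 2, 3 and leave the loop as soon as i exceeds 3.
total <- 0
for (i in 1:10) {
  if (i > 3) break   # exit the loop immediately; later iterations never run
  total <- total + i
}
total
```

Here total ends up as 6, because only the iterations for i = 1, 2, 3 add to it.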
The efficiency of code like this can often be improved considerably in R through the use of the apply functions, which apply a function to every element of a vector or list and collect the results for you, so the result vector is not grown by hand on each iteration. For the loop shown above, there are two basic operations happening during each iteration:
# A vector of two random numbers is generated
x <- rnorm( 2 )
# The difference between those numbers is calculated
x[2] - x[1]
In this case the appropriate function would be sapply(). sapply() operates on a vector or list of objects, such as the vector generated by the expression 1:N, and returns a vector of results:
sapply( 1:N, function( i ){ x <- rnorm(2); return( x[2] - x[1] ) } )
Note that the index value i is available during the function call and successively takes on the values from 1 to N; however, it is not needed in this case.
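Because i is not used and every draw is independent, the whole simulation can also be written without any per-element function call. This is a sketch of one such vectorized variant, not something from the original answer; the name fn1_vec is hypothetical.

```r
# Draw all first values and all second values at once, then subtract.
fn1_vec <- function(N) {
  x1 <- rnorm(N)   # stands in for x[1] across all N iterations
  x2 <- rnorm(N)   # stands in for x[2] across all N iterations
  x2 - x1          # elementwise difference, a numeric vector of length N
}

out_vec <- fn1_vec(1000)
```

The results follow the same distribution as those of fn1, but they will not match it draw for draw under the same seed, because the random numbers are consumed in a different order.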
Getting into the habit of recognizing where apply can be used over for is a very valuable skill: many R libraries for parallel computation provide plug-and-play parallelization through apply functions. Using apply can often allow access to significant performance increases on multicore systems with little or no refactoring of code.
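As one illustration of that plug-and-play idea, here is a minimal sketch using mclapply() from the parallel package that ships with R. The helper name diff_one and the choice of two cores are assumptions made for this example. mclapply() relies on forking, so on Windows it only runs with mc.cores = 1.

```r
library(parallel)

# Hypothetical helper: one iteration of the simulation.
diff_one <- function(i) {
  x <- rnorm(2)
  x[2] - x[1]
}

N <- 1000

# Assumption: two worker processes; fall back to one on Windows,
# where forking is not available.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# mclapply() returns a list, like lapply(), so flatten it to a vector.
res <- unlist(mclapply(1:N, diff_one, mc.cores = n_cores))
```

For a task this cheap per iteration, the overhead of starting worker processes may outweigh any gain; the pattern tends to pay off when each call does substantial work.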