Whenever I run large scale monte carlo simulations in S-Plus, I always end up growing a beard while I wait for it to complete.
What are the best tricks for running monte carlo simulations in R? Any good examples of running processes in a distributed fashion?
Using multiple cores/machines should be simple if you're just using parallel independent replications, but be aware of common deficiencies of random number generators (e.g. if using the current time as seed, spawning many processes with one RNG for each might produce correlated random numbers, which leads to invalid results - see e.g. this paper)
You might want to use variance reduction to reduce the number of required replications, i.e. to shrink the size of the required sample. More advanced variance reduction techniques can be found in many textbooks, e.g. in this one.
Preallocate your vectors!
> nsims <- 10000
> n <- 100
>
> system.time({
res <- NULL
for (i in 1:nsims) {
res <- c(res,mean(rnorm(n)))
}
})
user system elapsed
0.761 0.015 0.783
>
> system.time({
res <- rep(NA, nsims)
for (i in 1:nsims) {
res[i] <- mean(rnorm(n))
}
})
user system elapsed
0.485 0.001 0.488
>
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With