Imagine we are doing a number of processes where i want to set one overall seed at the beginning of a program: e.g.
mylist <- list( as.list(rep(NA,3)), as.list(rep(NA,3)) )
foo <- function(x){ for(i in 1:length(x)){
x[[i]] <- sample(100,1)
}
return(x)
}
# start block
set.seed(1)
l1 <- lapply(mylist, foo)
l2 <- lapply(mylist, foo)
# end
of course within a block l1
and l2
are different, but if i run the above block again l1
will be the same as before and l2
will be the same as before.
Imagine foo
is horribly time consuming so i want to use mclapply
not lapply
, so i do:
library(parallel)
# start block
set.seed(1)
mclapply(mylist , foo, mc.cores = 3)
mclapply(mylist , foo, mc.cores = 3)
# end
If i run this block again I will get different results the next time. How do I produce the same behaviour as with setting one overall seed with lapply
but using mclappy
. I have looked through mclapply
doc but I am not sure because using:
set.seed(1)
l1 <- mclapply(mylist , foo, mc.cores = 3, mc.set.seed=FALSE)
l2 <- mclapply(mylist , foo, mc.cores = 3, mc.set.seed=FALSE)
result in l1
and l2
being the same, which is not what I want...
The parallel
package comes with special support for the "L'Ecuyer-CMRG" random number generator which was introduced at the same time as parallel
. You can read the documentation for that support using:
library(parallel)
?mc.reset.stream
To use it, you first need to enable "L'Ecuyer-CMRG":
RNGkind("L'Ecuyer-CMRG")
After doing that, code such as:
set.seed(1)
mclapply(mylist, foo, mc.cores=3)
mclapply(mylist, foo, mc.cores=3)
will be reproducible, but the two calls to mclapply
will return identical results. This is because the state of the random number generator in the master process isn't changed by calling mclapply
.
I've used the following function to skip over the random number streams used by the mclapply
workers:
skip.streams <- function(n) {
x <- .Random.seed
for (i in seq_len(n))
x <- nextRNGStream(x)
assign('.Random.seed', x, pos=.GlobalEnv)
}
You can use this function to get the behavior that I think you want:
set.seed(1)
mclapply(mylist, foo, mc.cores=3)
skip.streams(3)
mclapply(mylist, foo, mc.cores=3)
skip.streams(3)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With