I was doing some parallel simulations in R and I notice that the seed is not changed when the "L'Ecuyer-CMRG" rng is used. I was reading the book "Parallel R", and the option mc.set.seed = TRUE should give each worker a new seed each time mclapply() is called.
Here is my code:
library(parallel)
RNGkind("L'Ecuyer-CMRG")
mclapply(1:2, function(n) rnorm(n), mc.set.seed = TRUE)
[[1]]
[1] -0.7125037
[[2]]
[1] -0.9013552 0.3445190
mclapply(1:2, function(n) rnorm(n), mc.set.seed = TRUE)
[[1]]
[1] -0.7125037
[[2]]
[1] -0.9013552 0.3445190
EDIT: same thing happens both on my desktop and on my laptop (both Ubuntu 12.04 LTS).
It appears to me that if you want to guarantee that subsequent calls to mclapply in an R session get different random numbers, you need to either call set.seed with a different value, remove the global variable ".Random.seed", or generate at least one random number in that R session before calling mclapply again.
The reason for this behavior is that mclapply (unlike mcparallel for example) calls mc.reset.stream internally. This resets the seed that is stashed in the "parallel" package to the value of ".Random.seed", so if ".Random.seed" hasn't changed when mclapply is called again, the workers started by mclapply will get the same random numbers as they did previously.
Note that this is not the case with functions such as clusterApply and parLapply, since they use persistent workers, and therefore continue to draw random numbers from their RNG stream. But new workers are forked every time mclapply is called, presumably making it much harder to have that type of behavior.
Here's an example of setting the seed to different values in order to get different random numbers using mclapply:
RNGkind("L'Ecuyer-CMRG")
set.seed(100)
mclapply(1:2, function(i) rnorm(2))
set.seed(101)
mclapply(1:2, function(i) rnorm(2))
Here's an example of removing ".Random.seed":
RNGkind("L'Ecuyer-CMRG")
mclapply(1:2, function(i) rnorm(2))
rm(.Random.seed)
mclapply(1:2, function(i) rnorm(2))
And here's an example of generating random numbers on the master:
RNGkind("L'Ecuyer-CMRG")
mclapply(1:2, function(i) rnorm(2))
rnorm(1)
mclapply(1:2, function(i) rnorm(2))
I'm not sure which is the best approach, but that may depend on what you're trying to do.
Although it appears that simply calling mclapply multiple times without changing ".Random.seed" results in reproducible results, I don't know if that is guaranteed. To guarantee reproducible results, I think you need to call set.seed:
RNGkind("L'Ecuyer-CMRG")
set.seed(1234)
mclapply(1:2, function(i) rnorm(2))
set.seed(1234)
mclapply(1:2, function(i) rnorm(2))
Have you considered using clusterApply
instead of mclapply
?
I think it's easier to guarantee reproducible examples using this framework. I'm attaching an example.
library(parallel)
#---- creating local cluster ----
clust <- makeCluster(detectCores())
#---- seed ----
RNGkind(kind = "L'Ecuyer-CMRG")
set.seed(1234)
s <- .Random.seed
clusterSetRNGStream(cl = clust, iseed = s)
#---- generating random numbers ----
clusterApply(cl = clust, x = 1:2, fun = function(i) rnorm(2))
#> [[1]]
#> [1] -0.0514128 0.7344781
#>
#> [[2]]
#> [1] 0.3946233 -0.6649782
clusterApply(cl = clust, x = 1:2, fun = function(i) rnorm(2))
#> [[1]]
#> [1] -1.57875323 -0.07533176
#>
#> [[2]]
#> [1] -0.3698359 -0.1564795
#---- repeating the process ----
set.seed(1234)
s <- .Random.seed
clusterSetRNGStream(cl = clust, iseed = s)
clusterApply(cl = clust, x = 1:2, fun = function(i) rnorm(2))
#> [[1]]
#> [1] -0.0514128 0.7344781
#>
#> [[2]]
#> [1] 0.3946233 -0.6649782
clusterApply(cl = clust, x = 1:2, fun = function(i) rnorm(2))
#> [[1]]
#> [1] -1.57875323 -0.07533176
#>
#> [[2]]
#> [1] -0.3698359 -0.1564795
# same result
Created on 2019-03-04 by the reprex package (v0.2.1)
devtools::session_info()
#> Session info -------------------------------------------------------------
#> setting value
#> version R version 3.5.1 (2018-07-02)
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> tz America/Sao_Paulo
#> date 2019-03-04
#> Packages -----------------------------------------------------------------
#> package * version date source
#> backports 1.1.2 2017-12-13 CRAN (R 3.5.0)
#> base * 3.5.1 2018-07-03 local
#> compiler 3.5.1 2018-07-03 local
#> datasets * 3.5.1 2018-07-03 local
#> devtools 1.13.6 2018-06-27 CRAN (R 3.5.0)
#> digest 0.6.17 2018-09-12 cran (@0.6.17)
#> evaluate 0.11 2018-07-17 cran (@0.11)
#> graphics * 3.5.1 2018-07-03 local
#> grDevices * 3.5.1 2018-07-03 local
#> htmltools 0.3.6 2017-04-28 CRAN (R 3.5.0)
#> knitr 1.20 2018-02-20 CRAN (R 3.5.0)
#> magrittr 1.5 2014-11-22 cran (@1.5)
#> memoise 1.1.0 2017-04-21 CRAN (R 3.5.0)
#> methods * 3.5.1 2018-07-03 local
#> parallel * 3.5.1 2018-07-03 local
#> Rcpp 0.12.18 2018-07-23 cran (@0.12.18)
#> rmarkdown 1.10 2018-06-11 CRAN (R 3.5.1)
#> rprojroot 1.3-2 2018-01-03 CRAN (R 3.5.0)
#> stats * 3.5.1 2018-07-03 local
#> stringi 1.2.4 2018-07-20 cran (@1.2.4)
#> stringr 1.3.1 2018-05-10 CRAN (R 3.5.0)
#> tools 3.5.1 2018-07-03 local
#> utils * 3.5.1 2018-07-03 local
#> withr 2.1.2 2018-03-15 CRAN (R 3.5.0)
#> yaml 2.2.0 2018-07-25 cran (@2.2.0)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With