Is there a problem when accessing/writing to global variable in using doSNOW package on multiple cores?
In the below program, each of the MyCalculations(ii) writes to the ii-th column of the matrix "globalVariable"...
Do you think the result will be correct? Will there be hidden catches?
Thanks a lot!
p.s. I have to write out to the global variable because this is a simplied example, in fact I have lots of outputs that need to be transported from within the parallel loops... therefore, probably the only way is to write out to global variables...
library(doSNOW)
MaxSearchSpace=44*5
globalVariable=matrix(0, 10000, MaxSearchSpace)
cl<-makeCluster(7)
registerDoSNOW(cl)
foreach (ii = 2:nMaxSearchSpace, .combine=cbind, .verbose=F) %dopar%
{
MyCalculations(ii)
}
stopCluster(cl)
p.s. I am asking - within the DoSnow framework, is there any danger of accessing/writing global variables... thx
foreach. Description %do% and %dopar% are binary operators that operate on a foreach object and an R expression. The expression, ex, is evaluated multiple times in an environment that is created by the foreach. object, and that environment is modified for each evaluation as specified by the foreach object.
Overview. Global variables in R are variables created outside a given function. A global variable can also be used both inside and outside a function.
Since this question is a couple months old, I hope you've found an answer by now. However, in case you're still interested in feedback, here's something to consider:
When using foreach
with a parallel backend, you won't be able to assign to variables in R's global environment in the way you're attempting (you probably noticed this). Using a sequential backend, assignment will work, but not using a parallel one like with doSNOW
.
Instead, save all the results of your calculations for each iteration in a list and return this to an object, so that you can extract the appropriate results after all calculations have been completed.
My suggestion starts similarly to your example:
library(doSNOW)
MaxSearchSpace <- 44*5
cl <- makeCluster(parallel::detectCores())
# do not create the globalVariable object
registerDoSNOW(cl)
# Save the results of the `foreach` iterations as
# lists of lists in an object (`theRes`)
theRes <- foreach (ii = 2:MaxSearchSpace, .verbose=F) %dopar%
{
# do some calculations
theNorms <- rnorm(10000)
thePois <- rpois(10000, 2)
# store the results in a list
list(theNorms, thePois)
}
After all iterations have been completed, extract the results from theRes
and store them as objects (e.g., globalVariable
, globalVariable2
, etc.)
globalVariable1 <- do.call(cbind, lapply(theRes, "[[", 1))
globalVariable2 <- do.call(cbind, lapply(theRes, "[[", 2))
With this in mind, if you are performing calculations with each iteration that are dependent on the results of calculations from previous iterations, then this type of parallel computing is not the approach to take.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With