When using clusters in R on Windows, I have been trying to find a simple way to transfer results from a cluster node back to the master. If the result is an array or a simple number, the .combine option of the foreach / %dopar% statement takes care of this. But if the result is a complex object, such as a randomForest model, how do I transfer the whole model from the slave back to the master?
I tried assign with envir = .GlobalEnv, but it does not work on my Windows 7 machine.
In the end I worked around it by saving the object to a file, which the master can then load. If someone knows a more elegant way, or why assign does not work, I would appreciate comments.
Sample code:
print(" parallelize with 8 cores ------------------------------")
library(doSNOW)
cl<-makeCluster(8)
registerDoSNOW(cl)
clusterEvalQ(cl, library(randomForest))
clusterExport(cl, "x")
clusterExport(cl, "y")
clusterExport(cl, "x.selected")
makeModel <- function(i){
  m <- randomForest(x, x.selected[i,], mtry=250, sampsize=3200, ntree=3000, do.trace=TRUE)
  # Store the model under a per-iteration name (model_1, model_2, ...) and save it to disk
  name <- paste("model_", i, sep="")
  assign(name, m)
  save(list = name, file = paste(name, ".RData", sep=""))
}
foreach(i = 1:length(x.selected[,1]),.verbose = TRUE ) %dopar% makeModel(i)
stopCluster(cl)
# Back on the master: load each saved model from disk
foreach(i = 1:length(x.selected[,1]), .verbose = TRUE) %do%
  load(paste("model_", i, ".RData", sep=""))
If you don't specify a .combine function, foreach will return a list in order to handle arbitrary objects just like the clusterApply function. Many foreach examples use .combine="c", but that won't work with randomForest model objects. If the body of the foreach loop evaluates to the randomForest model object, foreach will return a list of those objects.
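To make the difference concrete, here is a minimal sketch. It is not from the original example: it uses lm fits on the built-in mtcars data as a stand-in for randomForest models, and it runs sequentially with %do% so that it needs no cluster. The return value is built the same way under %dopar%.

```r
library(foreach)

# Hypothetical stand-in for the real workload: one lm fit per cylinder group.
cyl_levels <- c(4, 6, 8)

# No .combine: each iteration's value becomes one element of a list.
models <- foreach(k = cyl_levels) %do% {
  lm(mpg ~ wt, data = mtcars[mtcars$cyl == k, ])
}
class(models)       # a plain list
class(models[[1]])  # each element is still a complete model object

# .combine = "c": the model objects are concatenated component by component,
# so the result is one long list and the individual models are lost.
flat <- foreach(k = cyl_levels, .combine = "c") %do% {
  lm(mpg ~ wt, data = mtcars[mtcars$cyl == k, ])
}
inherits(flat, "lm")
```

Each element of models can be used directly, for example predict(models[[1]], newdata = mtcars), so there is no need to write the objects to disk first.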
Here is a simplified version of the randomForest example from the foreach package that returns model objects in a list and combines them afterwards. I also modified it to use the doSNOW package as in your example:
library(doSNOW)
library(randomForest)
cl <- makeCluster(8)
registerDoSNOW(cl)
nr <- 1000
x <- matrix(runif(100000), nr)
y <- gl(2, nr/2)
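# Each of the 8 tasks grows a 125-tree forest; with no .combine, rf is a list of models
rf <- foreach(ntree=rep(125, 8), .packages='randomForest') %dopar% {
  randomForest(x, y, ntree=ntree)
}
# combine() from randomForest merges the eight forests into a single model
crf <- do.call('combine', rf)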
print(crf)
stopCluster(cl)
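As for why assign did not help: a likely explanation is that each cluster worker is a separate R process, so assigning into .GlobalEnv on a worker changes that worker's global environment, not the master's. The sketch below illustrates this with the base parallel package rather than doSNOW (an assumption made to keep it dependency-free; snow-style clusters behave the same way in this respect). The variable name worker_value is made up for the illustration.

```r
library(parallel)

cl <- makeCluster(2)

# Each worker assigns into its *own* global environment.
invisible(clusterEvalQ(cl, assign("worker_value", Sys.getpid(), envir = .GlobalEnv)))

# The master's global environment is untouched ...
on_master <- exists("worker_value")

# ... while every worker can see its own copy.
on_workers <- unlist(clusterEvalQ(cl, exists("worker_value")))

# Objects reach the master only as return values of the call.
returned <- clusterEvalQ(cl, worker_value)

stopCluster(cl)
```

This is why returning the model as the value of the foreach body is the natural way to get it back to the master.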