error: object '.doSnowGlobals' not found?

I'm trying to parallelize some code across 4 nodes (type = "SOCK"). Here is my code:

library(itertools)
library(foreach)
library(doParallel)
library(parallel)

workers <- c("<ip of node 1>", "<ip of node 2>", "<ip of node 3>", "<ip of node 4>")  # IP addresses of the 4 nodes
cl <- makePSOCKcluster(workers, master = "<ip address of master>")
registerDoParallel(cl)

z <- read.csv("ProcessedData.csv", header=TRUE, as.is=TRUE)
z <- as.matrix(z)


system.time({
  chunks <- getDoParWorkers()
  b <- foreach (these = isplitIndices(nrow(z),
                                      chunks=chunks),
                .combine = c) %dopar% {
                  a <- rep(0, length(these))
                  for (i in 1:length(these)) {
                    a[i] <- mean(z[these[i],])
                  }
                  a
                }
})

I get this error:

4 nodes produced errors; first error: object '.doSnowGlobals' not found.

This code runs fine if I use doMC, i.e. only the cores of the same machine. But when I try to use other computers for parallel computing, I get the above error. The error persists when I switch to registerDoSNOW.

Do snow and doSNOW work on a cluster? I could create nodes on localhost using snow, but not on the cluster. Is anyone out there using snow?

Rajendra Kumar asked Aug 01 '14 11:08


2 Answers

You get this error if any of the workers is unable to load the doParallel package. You can reproduce it by installing doParallel into a non-default directory and pointing only the master at it with .libPaths:

> .libPaths('~/R/lib.test')
> library(doParallel)
> cl <- makePSOCKcluster(3, outfile='')
starting worker pid=26240 on localhost:11566 at 13:47:59.470
starting worker pid=26248 on localhost:11566 at 13:47:59.667
starting worker pid=26256 on localhost:11566 at 13:47:59.864
> registerDoParallel(cl)
> foreach(i=1:10) %dopar% i
Warning: namespace ‘doParallel’ is not available and has been replaced
by .GlobalEnv when processing object ‘’
Warning: namespace ‘doParallel’ is not available and has been replaced
by .GlobalEnv when processing object ‘’
Warning: namespace ‘doParallel’ is not available and has been replaced
by .GlobalEnv when processing object ‘’
Error in checkForRemoteErrors(lapply(cl, recvResult)) : 
  3 nodes produced errors; first error: object '.doSnowGlobals' not found

The warning is issued when a function from doParallel is deserialized on a worker. The error occurs when that function is executed and tries to access .doSnowGlobals, which is defined in the doParallel namespace, not in .GlobalEnv.

You can also verify that doParallel is available on the workers by executing:

> clusterEvalQ(cl, library(doParallel))
Error in checkForRemoteErrors(lapply(cl, recvResult)) : 
  3 nodes produced errors; first error: there is no package called ‘doParallel’
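
If you would rather see which workers are affected than get a single error, a non-throwing variant of the same check might look like the sketch below. It relies on requireNamespace(), which returns FALSE instead of raising an error. The two-worker local cluster is only a stand-in for the real remote nodes, and the names status and has_pkg are made up for this example.

```r
library(parallel)

# Stand-in for the real cluster: two local workers (assumption for illustration).
cl <- makePSOCKcluster(2)

# Ask every worker for its host name, its library paths, and whether
# doParallel can be loaded there. requireNamespace() returns TRUE/FALSE
# instead of raising an error.
status <- clusterEvalQ(cl, list(
  host     = Sys.info()[["nodename"]],
  libpaths = .libPaths(),
  has_pkg  = requireNamespace("doParallel", quietly = TRUE)
))

# One logical per worker; FALSE marks a node where the package is missing.
has_pkg <- vapply(status, function(s) s$has_pkg, logical(1))
print(has_pkg)

stopCluster(cl)
```

Comparing the libpaths entry of a failing worker with the master's .libPaths() usually shows where the two disagree.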

Steve Weston answered Oct 30 '22 23:10


To set the library path on each worker you can run:

clusterEvalQ(cl, .libPaths("Your library path"))
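
Note that .libPaths() silently drops directories that do not exist, so it can help to have each worker report its library paths back after setting them. A minimal sketch, assuming a local two-worker cluster as a stand-in for the real nodes and using tempdir() as a hypothetical placeholder for the directory where doParallel is actually installed:

```r
library(parallel)

# Stand-in for the real cluster: two local workers (assumption for illustration).
cl <- makePSOCKcluster(2)

# Hypothetical placeholder: replace with the directory that holds
# doParallel on the worker machines.
lib_dir <- tempdir()

# Prepend lib_dir to each worker's library search path and return the
# resulting paths. A directory that does not exist on a worker is
# silently left out, so inspect the returned values.
paths <- clusterCall(cl, function(p) {
  .libPaths(c(p, .libPaths()))
  .libPaths()
}, lib_dir)

print(paths)

stopCluster(cl)
```

Passing the path as an argument to clusterCall() avoids having to export a variable to the workers first. If lib_dir is missing from a worker's output, the directory is not visible on that machine.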

Nat answered Oct 31 '22 00:10