I am trying to use parLapply inside another function that is not defined in the global environment. The worker function makes use of a list of other functions that I want to clusterExport beforehand, and these are also not defined in the global environment. My problem is that both functions export their evaluation environments to the cluster workers, and these environments are huge and not needed.
Let us call the worker function workerFunction and the function list functionList.
workerFunction <- function(i) {
  intermediateOutput <- functionList[[i]](y)
  result <- otherCalculations(intermediateOutput)
  return(result)
}
library(parallel)
cl <- makeCluster(detectCores())
environment(workerFunction) <- .GlobalEnv
environment(functionList) <- .GlobalEnv
clusterExport(cl, varlist=c("functionList", "y"), envir=.GlobalEnv)
output <- parLapply(cl, inputVector, workerFunction)
I get:
Error in get(name, envir = envir) (from <text>#53) : object 'functionList' not found
If I don't set environment(functionList) <- .GlobalEnv, then the huge enclosing environment of functionList is exported to the cluster workers. Why can't R find functionList in the global environment?
It's hard to guess the problem without a complete example, but I'm wondering if the error message isn't coming from clusterExport, rather than parLapply. That would happen if functionList was defined in a function rather than the global environment, since the clusterExport envir argument specifies the environment from which to export the variables.
To export variables defined in a function, from that same function, you would use:
clusterExport(cl, varlist=c("functionList", "y"), envir=environment())
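The lookup difference can be checked without starting a cluster. The following is only a sketch: lookupDemo and functionListDemo are hypothetical names chosen for illustration, and exists() stands in for the get(name, envir = envir) call that appears in the error message above.

```r
# Hypothetical demo: a variable created inside a function lives in that
# function's evaluation environment, not in the global environment.
lookupDemo <- function() {
  functionListDemo <- list(function(x) x)
  c(
    # Lookup starting from the global environment
    global = exists("functionListDemo", envir = .GlobalEnv),
    # Lookup starting from the current function's environment
    local  = exists("functionListDemo", envir = environment())
  )
}

found <- lookupDemo()
print(found)
```

As long as no object called functionListDemo happens to exist in your global workspace or an attached package, the global lookup should fail and the local one should succeed, which is why envir=environment() is the value to pass when exporting from inside the function.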
I'm just guessing this might be a problem for you since I don't know how or where you defined functionList. Note that clusterExport always assigns the variables to the global environment of the cluster workers.
I'm also suspicious of the way that you are apparently setting the environment of a list: that seems to be legal, but I don't think it will change the environment of functions in that list. In fact, I suspect that exporting functions to the workers in a list may have other problems that you haven't encountered yet. I would use something like this:
mainFunction <- function(cl) {
  fa <- function(x) fb(x)
  fb <- function(x) fc(x)
  fc <- function(x) x
  y <- 7
  workerFunction <- function(i) {
    do.call(functionNames[[i]], list(y))
  }
  environment(workerFunction) <- .GlobalEnv
  environment(fa) <- .GlobalEnv
  environment(fb) <- .GlobalEnv
  environment(fc) <- .GlobalEnv
  functionNames <- c("fa", "fb", "fc")
  clusterExport(cl, varlist=c("functionNames", functionNames, "y"),
                envir=environment())
  parLapply(cl, seq_along(functionNames), workerFunction)
}
library(parallel)
cl <- makeCluster(detectCores())
mainFunction(cl)
stopCluster(cl)
Note that I've taken liberties with your example, so I'm not sure how well this corresponds with your problem.
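On the point about setting the environment of a list, here is a small sketch that can be run without a cluster. It assumes a hypothetical makeFunctionList helper whose closures share an enclosing environment holding a large unused vector, and it uses the length of the serialized object as a rough proxy for what would be shipped to a worker. As far as I know, environment<- on a non-function only attaches an attribute to that object, so the functions inside the list keep their original environment; resetting each element individually is what actually drops the big environment.

```r
# Hypothetical helper: the returned closures share an enclosing
# environment that also contains a large, unneeded object.
makeFunctionList <- function() {
  big <- numeric(1e6)
  list(function(x) x + 1, function(x) x * 2)
}

functionList <- makeFunctionList()

# Setting the environment on the list itself leaves the elements untouched.
listWithAttr <- functionList
environment(listWithAttr) <- globalenv()
elementChanged <- identical(environment(listWithAttr[[1]]), globalenv())

# Resetting the environment of each function in the list individually.
resetList <- lapply(functionList, function(f) {
  environment(f) <- globalenv()
  f
})

# Rough proxy for the amount of data sent to a worker.
sizeBefore <- length(serialize(functionList, NULL))
sizeAfter  <- length(serialize(resetList, NULL))

print(elementChanged)
print(c(before = sizeBefore, after = sizeAfter))
```

Keep in mind that a function whose environment has been reset to the global environment can no longer see variables from its original enclosing environment, so anything it refers to (such as y in the question) still has to be exported to the workers separately.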