R: How can I export methods provided by a package to a PSOCK cluster?

In R, I have a list of matrices and would like to apply a summary function to the matrices in the list. The matrices represent social networks, therefore I need to apply some specialized summary functions provided by the ergm package. These summary statistics are contained in a summary method. I can write a function as a wrapper around this summary method and use lapply to apply the function to the list of matrices.

However, when I try to parallelize this with parLapply or parSapply from the parallel package, the results look wrong. When I additionally export the summary.statistics function to the cluster, I get an error message saying that the method is not applicable.

Do I have to export the summary method that is provided by the ergm package to the cluster object? If so, how? The following code is a self-contained example.

library("ergm")
library("parallel")

# create list of matrices
m <- matrix(rbinom(900, 1, 0.1), nrow = 30)
l <- list(m, m, m, m, m)

# write wrapper function that computes results
fun <- function(mat) {
  s <- summary(mat ~ edges + dsp(1))
  return(s)
}

cl <- makePSOCKcluster(2)  # create cluster object

test1 <- sapply(l, fun)  # works!
test2 <- parSapply(cl, l, fun)  # problem: results look weird!

clusterExport(cl, varlist = "summary.statistics")
test3 <- parSapply(cl, l, fun)  # problem: says method is not applicable!
Philip Leifeld asked Nov 17 '15 15:11


1 Answer

Instead of exporting functions that are defined in packages, you should load the package in the workers using something like:

clusterEvalQ(cl, library("ergm"))

You should always load all of the packages needed by the worker function on the workers themselves, since packages aren't loaded there automatically just because they have been loaded by the master.
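As a minimal sketch, assuming ergm is installed in the library that the worker processes use, the example from the question could be adapted roughly as follows. The set.seed call and the final comparison are additions for illustration only and are not part of the original code.

```r
library("ergm")
library("parallel")

# Assumption: a fixed seed, used here only to make the sketch reproducible
set.seed(1)

# Same data as in the question: a list of identical 30 x 30 binary matrices
m <- matrix(rbinom(900, 1, 0.1), nrow = 30)
l <- list(m, m, m, m, m)

# Wrapper around the summary method that ergm provides for formulas
fun <- function(mat) {
  summary(mat ~ edges + dsp(1))
}

cl <- makePSOCKcluster(2)

# Attach ergm in every worker so its methods are available there;
# invisible() only suppresses printing of the per-worker return values
invisible(clusterEvalQ(cl, library("ergm")))

test1 <- sapply(l, fun)         # sequential reference result
test2 <- parSapply(cl, l, fun)  # parallel result, computed on the workers

stopCluster(cl)                 # shut the workers down when finished

# Compare the parallel result with the sequential one
print(all.equal(test1, test2))
```

With the package attached on the workers, there should be no need to call clusterExport for summary.statistics at all; clusterExport remains useful for objects you define yourself in the master session.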

Steve Weston answered Oct 31 '22 04:10
