In R, I have a list of matrices and would like to apply a summary function to each matrix in the list. The matrices represent social networks, so I need to apply some specialized summary functions provided by the ergm package. These summary statistics are contained in a summary method. I can write a function as a wrapper around this summary method and use lapply to apply the function to the list of matrices.
However, when I try to parallelize this by using parLapply or parSapply from the parallel package, the results look weird. And when I export the summary.statistics function, I even get an error message.
Do I have to export the summary method that is provided by the ergm package to the cluster object? If so, how? The following code is a self-contained example.
library("ergm")
library("parallel")
# create list of matrices
m <- matrix(rbinom(900, 1, 0.1), nrow = 30)
l <- list(m, m, m, m, m)
# write wrapper function that computes results
fun <- function(mat) {
  s <- summary(mat ~ edges + dsp(1))
  return(s)
}
cl <- makePSOCKcluster(2) # create cluster object
test1 <- sapply(l, fun) # works!
test2 <- parSapply(cl, l, fun) # problem: results look weird!
clusterExport(cl, varlist = "summary.statistics")
test3 <- parSapply(cl, l, fun) # problem: says method is not applicable!
Instead of exporting functions that are defined in packages, you should load the package in the workers using something like:
clusterEvalQ(cl, library("ergm"))
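You should always load all of the packages needed by the worker function on the workers, since they aren't loaded there automatically just because the package has been loaded by the master. Without ergm attached on the workers, summary() has no ergm method available for the formula and presumably falls back to the default method, which would explain the odd-looking results.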
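Putting this together, here is a minimal sketch of the full workflow. It reuses the example from the question and assumes ergm is installed on the machine that hosts the workers; the names serial_result and parallel_result are only illustrative.

```r
library("ergm")
library("parallel")

# Same toy data as in the question: five copies of a random 30 x 30 matrix
m <- matrix(rbinom(900, 1, 0.1), nrow = 30)
l <- list(m, m, m, m, m)

# Wrapper around the summary method that ergm provides for formulas
fun <- function(mat) {
  summary(mat ~ edges + dsp(1))
}

cl <- makePSOCKcluster(2)

# Attach ergm on every worker so that summary() dispatches there
# the same way it does in the master session
clusterEvalQ(cl, library("ergm"))

serial_result <- sapply(l, fun)            # reference result, computed locally
parallel_result <- parSapply(cl, l, fun)   # same computation on the workers

stopCluster(cl)  # shut the workers down when finished
```

With the package attached on the workers, the parallel result should have the same shape and values as the serial one. No clusterExport call is needed for functions that live in a package.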