
Passing an entire package to a snow cluster

I'm trying to parallelize (using snow::parLapply) some code that depends on a package (i.e., a package other than snow). Objects referenced in the function called by parLapply must be explicitly passed to the cluster using clusterExport. Is there any way to pass an entire package to the cluster, rather than having to explicitly name every function (including the package's internal functions called by its user-facing functions!) in clusterExport?

Michael asked Sep 02 '12 01:09


1 Answer

Install the package on all nodes, and have your code call library(thePackageYouUse) on all nodes via one of the available commands, e.g. something like

 clusterEvalQ(cl, library(thePackageYouUse))

I think the parallel package which comes with recent R releases has examples -- see for example here from help(clusterApply) where the boot package is loaded everywhere:

 ## A bootstrapping example, which can be done in many ways:
 clusterEvalQ(cl, {
   ## set up each worker.  Could also use clusterExport()
   library(boot)
   cd4.rg <- function(data, mle) MASS::mvrnorm(nrow(data), mle$m, mle$v)
   cd4.mle <- list(m = colMeans(cd4), v = var(cd4))
   NULL
 })
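
Putting the pieces together, a minimal end-to-end sketch might look like the following. It assumes a local two-worker cluster and uses the tools package (shipped with R, but not attached on workers by default) purely as a stand-in for thePackageYouUse; the file names are made up for illustration. It uses parallel, whose makeCluster, clusterEvalQ and parLapply mirror the snow functions of the same names.

```r
library(parallel)  # ships with R; snow provides functions with the same names

# Start a small local cluster (2 workers is an arbitrary choice).
cl <- makeCluster(2)

# Attach the package on every worker. 'tools' is only a stand-in for
# thePackageYouUse; the real package must be installed on each node.
invisible(clusterEvalQ(cl, library(tools)))

# The worker function can now call the package's functions by name
# (and they can reach their own internals) without any clusterExport().
files <- c("report.pdf", "data.csv", "notes.txt")  # hypothetical inputs
exts <- parLapply(cl, files, function(f) file_ext(f))

# Always shut the workers down when finished.
stopCluster(cl)
```

clusterExport() is then only needed for objects you define yourself in the master session, such as data or helper functions, since everything that lives in the package is already available on each worker once library() has run there.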

Dirk Eddelbuettel answered Oct 24 '22 04:10