I am writing a function that combines and organizes data, then runs MCMC chains in parallel using the parallel package that ships with base R. My function is below.
dm100zip <- function(y, n.burn = 1, n.it = 3000, n.thin = 1) {
  y <- array(c(as.matrix(y[,2:9]), as.matrix(y[ ,10:17])), c(length(y$Plot), 8, 2))
  nplots <- nrow(y)
  ncap1 <- apply(y[,1:8, 1],1,sum)
  ncap2 <- apply(y[,1:8, 2],1,sum)
  ncap <- as.matrix(cbind(ncap1, ncap2))
  ymax1 <- apply(y[,1:8, 1],1,sum)
  ymax2 <- apply(y[,1:8, 2],1,sum)

  # Bundle data for JAGS/BUGS
  jdata100 <- list(y=y, nplots=nplots, ncap=ncap)

  # Set initial values for Gibbs sampler
  inits100 <- function(){
    list(p0=runif(1, 1.1, 2),
         p.precip=runif(1, 0, 0.1),
         p.day = runif(1, -.5, 0.1))
  }

  # Set parameters of interest to monitor and save
  params100 <- c("N", "p0")

  # Run JAGS in parallel for improved speed
  CL <- makeCluster(3) # set number of clusters = to number of desired chains
  clusterExport(cl=CL, list("jdata100", "params100", "inits100", "ymax1", "ymax2", "n.burn", "jag", "n.thin")) # make data available to jags in diff cores
  clusterSetRNGStream(cl = CL, iseed = 5312)

  out <- clusterEvalQ(CL, {
    library(rjags)
    load.module('glm')
    jm <- jags.model("dm100zip.txt", jdata100, inits100, n.adapt = n.burn, n.chains = 1)
    fm <- coda.samples(jm, params100, n.iter = n.it, thin = n.thin)
    return(as.mcmc(fm))
  })

  out.list <- mcmc.list(out) # group output from each core into one list
  stopCluster(CL)
  return(out.list)
}
When I run the function, I get an error that n.burn, n.it, and n.thin are not found for use in the clusterExport function. For example,
dm100zip.list.nain <- dm100zip(NAIN, n.burn = 1, n.it = 3000, n.thin = 1) # returns error
If I set values for each of them before running the function, then it uses those values and runs fine. For example,
n.burn = 1
n.it = 1000
n.thin = 1
dm100zip.list.nain <- dm100zip(NAIN, n.burn = 1, n.it = 3000, n.thin = 1)
This runs fine, but it uses n.it = 1000, not 3000.
Can someone explain why the clusterExport function uses the objects in the global environment rather than the values assigned by the function that clusterExport is called from? Is there a way around this?
By default, clusterExport looks for the variables specified by "varlist" in the global environment. In your case, it should look in the local environment of the dm100zip function. To make it do that, you use the clusterExport "envir" argument:
# "n.it" is added to the list here because the code run on the workers uses it
clusterExport(cl=CL, varlist=c("jdata100", "params100", "inits100", "ymax1",
                               "ymax2", "n.burn", "n.it", "jag", "n.thin"),
              envir=environment())
Note that variables in "varlist" that are defined in the global environment will also be found, but values defined in dm100zip will take precedence.
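The lookup rule above can be checked with a small sketch that leaves JAGS out entirely. The helper name run_on_workers and the two-worker cluster are assumptions made for illustration; only makeCluster, clusterExport, clusterEvalQ, and stopCluster from the parallel package are used.

```r
library(parallel)

# Hypothetical helper: exports its own argument to the workers and reads it back
run_on_workers <- function(n.it = 3000) {
  cl <- makeCluster(2)
  on.exit(stopCluster(cl))
  # envir = environment() makes clusterExport look up "n.it" in this
  # function's frame before falling back to the global environment
  clusterExport(cl, varlist = "n.it", envir = environment())
  # Each worker now holds a copy of n.it in its own global environment
  unlist(clusterEvalQ(cl, n.it))
}

n.it <- 1000                        # a global with the same name
res <- run_on_workers(n.it = 3000)  # workers should see 3000, not 1000
print(res)
```

Without the envir argument, the same call would export the global n.it instead, which matches the behaviour described in the question.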
Since function arguments in R are processed with lazy evaluation, you need to ensure that any default arguments have actually been evaluated in the function's execution environment. The R core authors included the force function for this purpose; it is simply function(x) x, and it turns the argument from an unevaluated promise into an evaluated value. Try making the following modification:
dm100zip <- function(y, n.burn = 1, n.it = 3000, n.thin = 1) {
force(n.burn); force(n.it); force(n.thin)
# The rest of your code as above...
}
For a more detailed explanation of these issues, consult the Lazy Evaluation section of Hadley Wickham's treatment of functions (the Functions chapter of Advanced R).
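To see what lazy evaluation and force do in isolation, here is a minimal sketch in base R. The function names lazy and eager are made up for illustration and are not part of the code in the question.

```r
# The argument x is never used, so its promise is never evaluated
lazy <- function(x) {
  "x was never evaluated"
}

# force(x) evaluates the promise immediately on entry to the function
eager <- function(x) {
  force(x)
  "x was evaluated"
}

# stop() is never run here, because nothing touches x
lazy_result <- lazy(stop("boom"))

# Here force(x) triggers stop(), so the error handler runs instead
eager_result <- tryCatch(eager(stop("boom")),
                         error = function(e) "error raised")

print(lazy_result)
print(eager_result)
```

Note that any ordinary use of an argument inside the function also evaluates it; force just makes that step explicit and early.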