I am using the parallel package to run computations. Here is a toy example:
library(parallel)
m = matrix(c(1,1,1,1,0.2,0.2,0.2,0.2), nrow=2)
myFun = function(x) {
    if (any(x<0.5)) {
        write("less than 0.5", stderr())
        return(NA)
    } else {
        write("good", stdout())
        return(mean(x))
    }
}
cl = makeCluster(2, outfile="/tmp/output")
parApply(cl, m, 2, myFun)
stopCluster(cl)
The problem is that both stdout and stderr are redirected to /tmp/output. The output file looks like this:
starting worker pid=51083 on localhost:11953 at 11:37:12.966
starting worker pid=51093 on localhost:11953 at 11:37:13.261
good
good
less than 0.5
less than 0.5
Is there any way to set up two separate files, one for stdout and one for stderr? And how can I suppress the first two "starting worker pid=..." lines?
The parallel package doesn't directly support sending stdout and stderr to separate files, but you can do it yourself:
cl = makeCluster(2)
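# Runs on each worker: open both files in append mode and divert output to them.
setup = function(outfile, errfile) {
    assign("outcon", file(outfile, open="a"), pos=.GlobalEnv)
    assign("errcon", file(errfile, open="a"), pos=.GlobalEnv)
    sink(outcon)                   # stdout goes to outfile
    sink(errcon, type="message")   # stderr (messages) goes to errfile
}
# Runs on each worker: undo the diversions, then close and remove the connections.
shutdown = function() {
    sink(NULL)
    sink(NULL, type="message")
    close(outcon)
    close(errcon)
    rm(outcon, errcon, pos=.GlobalEnv)
}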
clusterCall(cl, setup, "/tmp/output", "/tmp/errmsg")
parApply(cl, m, 2, myFun)
clusterCall(cl, shutdown)
stopCluster(cl)
Since the "starting worker" messages are issued before setup is called, those messages are redirected to "/dev/null", which is the default behavior when outfile isn't specified.
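Because every worker appends to the same two files in the code above, lines from different workers can end up interleaved. As a variation, here is a minimal sketch that gives each worker its own pair of files named after its process ID. The helper names setup_worker and shutdown_worker, the out-<pid>.log / err-<pid>.log naming, and the temporary log directory are my own assumptions for illustration, not anything provided by parallel:

```r
library(parallel)

# Same toy data and function as in the question.
m <- matrix(c(1, 1, 1, 1, 0.2, 0.2, 0.2, 0.2), nrow = 2)
myFun <- function(x) {
  if (any(x < 0.5)) {
    write("less than 0.5", stderr())
    NA
  } else {
    write("good", stdout())
    mean(x)
  }
}

# Hypothetical helper, run on each worker: open per-worker log files
# (named after the worker's PID) and divert stdout and stderr to them.
setup_worker <- function(logdir) {
  pid <- Sys.getpid()
  outcon <- file(file.path(logdir, sprintf("out-%d.log", pid)), open = "a")
  errcon <- file(file.path(logdir, sprintf("err-%d.log", pid)), open = "a")
  # Keep the connections in the worker's global environment for later cleanup.
  assign("outcon", outcon, envir = globalenv())
  assign("errcon", errcon, envir = globalenv())
  sink(outcon)
  sink(errcon, type = "message")
  invisible(NULL)
}

# Hypothetical helper, run on each worker: undo the diversions and close the files.
shutdown_worker <- function() {
  sink(NULL)
  sink(NULL, type = "message")
  close(get("outcon", envir = globalenv()))
  close(get("errcon", envir = globalenv()))
  rm(list = c("outcon", "errcon"), envir = globalenv())
  invisible(NULL)
}

# Assumed scratch location; any directory writable by the workers would do.
logdir <- tempfile("parlogs")
dir.create(logdir)

cl <- makeCluster(2)
invisible(clusterCall(cl, setup_worker, logdir))
res <- parApply(cl, m, 2, myFun)
invisible(clusterCall(cl, shutdown_worker))
stopCluster(cl)

# Gather what the workers wrote, across all per-worker files.
out_lines <- unlist(lapply(list.files(logdir, "^out-", full.names = TRUE), readLines))
err_lines <- unlist(lapply(list.files(logdir, "^err-", full.names = TRUE), readLines))
```

After this runs, out_lines should hold the two "good" lines and err_lines the two "less than 0.5" lines, with no "starting worker" lines in either. The trade-off is that you have to merge the per-worker files yourself if you want a single log.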