To see the console messages printed by a function running in a foreach() loop, I followed this guy's advice and added a sink() call like so:
library(foreach)
library(doMC)
cores <- detectCores()
registerDoMC(cores)
X <- foreach(i=1:100) %dopar%{
sink("./out/log.branchpies.txt", append=TRUE)
cat(paste("\n","Starting iteration",i,"\n"), append=TRUE)
myFunction(data, argument1="foo", argument2="bar")
}
However, at iteration 77 I got the error 'sink stack is full'. There are well-answered questions about avoiding this error when using for-loops, but not foreach. What's the best way to write the otherwise-hidden foreach output to a file?
This runs without errors on my Mac:
library(foreach)
library(doMC)
cores <- detectCores()
registerDoMC(cores)
X <- foreach(i=1:100) %dopar%{
sink("log.branchpies.txt", append=TRUE)
cat(paste("\n","Starting iteration",i,"\n"))
sink() #end diversion of output
rnorm(i*1e4)
}
This is better:
library(foreach)
library(doMC)
cores <- detectCores()
registerDoMC(cores)
sink("log.branchpies.txt", append=TRUE)
X <- foreach(i=1:100) %dopar%{
cat(paste("\n","Starting iteration",i,"\n"))
rnorm(i*1e4)
}
sink() #end diversion of output
This works too:
library(foreach)
library(doMC)
cores <- detectCores()
registerDoMC(cores)
X <- foreach(i=1:100) %dopar%{
cat(paste("\n","Starting iteration",i,"\n"),
file="log.branchpies.txt", append=TRUE)
rnorm(i*1e4)
}
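For what it's worth, the original error seems to come from the fact that every sink() call pushes a new diversion onto a stack of limited depth, and only a matching sink() with no arguments pops it again. In the question's loop nothing is ever popped, so a worker that handles enough iterations presumably runs out of room. Here is a minimal sequential sketch (no foreach needed) that shows the stack growing and one way to keep it balanced with on.exit(). The names log_file and log_iteration are made up for the illustration.

```r
# Hypothetical log location for this sketch
log_file <- tempfile(fileext = ".txt")

depth_before <- sink.number()

# Unbalanced: three diversions opened, none closed, so the stack grows by three
for (i in 1:3) sink(log_file, append = TRUE)
depth_leaky <- sink.number()

# Pop everything we pushed so that output goes back to the console
while (sink.number() > depth_before) sink()

# Balanced: on.exit() closes the diversion even if the body throws an error
log_iteration <- function(i) {
  sink(log_file, append = TRUE)
  on.exit(sink(), add = TRUE)
  cat("Starting iteration", i, "\n")
  invisible(i)
}
for (i in 1:3) log_iteration(i)
depth_balanced <- sink.number()
```

Checking sink.number() before and after a block like this is a quick way to find out whether some code leaves diversions open.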
As this guy suggests, keeping track of the sink stack is quite tricky. It is therefore advisable to use cat's own ability to write to a file, as suggested in the answer above:
cat(..., file="log.txt", append=TRUE)
To save some typing, you could create a wrapper function around cat that writes to a file by default:
catf <- function(..., file="log.txt", append=TRUE){
cat(..., file=file, append=append)
}
Then, when you call foreach, you would use something like this:
library(foreach)
library(doMC)
cores <- detectCores()
registerDoMC(cores)
X <- foreach(i=1:100) %dopar%{
catf(paste("\n","Starting iteration",i,"\n"))
rnorm(i*1e4)
}
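One possible caveat: when several workers append to the same file, their lines may end up interleaved. If that becomes a problem, a variant of the wrapper could write one log file per worker process. The sketch below is only an illustration under that assumption; catf_pid, the dir argument and the log.<pid>.txt naming scheme are invented here, not part of any package.

```r
# Hypothetical variant of catf(): one log file per worker process,
# named after the process ID of whichever process calls it
catf_pid <- function(..., dir = tempdir(), append = TRUE) {
  file <- file.path(dir, sprintf("log.%d.txt", Sys.getpid()))
  cat(..., file = file, append = append)
  invisible(file)  # return the path so the caller can find the log
}

# Called from the main session here; inside %dopar% each worker
# would write to a file carrying its own process ID
path <- catf_pid("Starting iteration", 1, "\n")
```

The per-process files can be read back with readLines() and concatenated once the loop has finished.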
Hope it helps!
Unfortunately, none of the approaches above worked for me. With sink() inside the foreach() loop, it kept throwing the "sink stack is full" error. With sink() outside the loop, the file was created but never updated.
To me, the easiest way to create a log file that tracks the progress of a parallelised foreach() loop is the good old write.table() function.
library(foreach)
library(doParallel)
availableClusters <- makeCluster(detectCores() - 1) # use all CPU threads but one (i.e. one is reserved for the OS)
registerDoParallel(availableClusters) # register the cluster for the parallelisation
x <- foreach(i = 1:100) %dopar% {
  # one timestamped line per iteration
  log.text <- paste0(Sys.time(), " processing loop run ", i, "/100")
  write.table(log.text, "loop-log.txt", append = TRUE, row.names = FALSE, col.names = FALSE)
  # your statements here
}
stopCluster(availableClusters) # shut the workers down when done
And don't forget (as I did several times...) to use append = TRUE within write.table().
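Another option that may be worth trying with this kind of cluster is the outfile argument of makeCluster(), which redirects the workers' stdout and stderr to a file, so plain cat() calls inside the loop body end up there. The sketch below uses only the parallel package and parLapply() to stay self-contained; the same cluster object could be passed to registerDoParallel() for use with foreach. The path worker-output.txt is just a placeholder.

```r
library(parallel)

# Hypothetical log location for this sketch
log_file <- file.path(tempdir(), "worker-output.txt")

# outfile sends each worker's stdout and stderr to this file
cl <- makeCluster(2, outfile = log_file)

res <- parLapply(cl, 1:4, function(i) {
  # Printed on the worker, so it goes to the outfile, not the console
  cat(format(Sys.time()), "processing loop run", i, "\n")
  i^2
})

stopCluster(cl)
```

Passing outfile = "" instead leaves the worker output unredirected, which can be handy when running from a terminal.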