I am running into a problem with the foreach section of a program I am working with in R. The program runs simulations for varying parameters and returns the results as a single list, which is then used to generate a report.
The problem is that not all of the assigned simulation runs appear in the report. By every indication, it looks as though only a subset of the runs had ever been assigned.
This is more likely to take place with larger data sets (longer time periods for a simulation, for example).
It is less likely to occur with a fresh run of the program, and more likely to occur if something is already taking up RAM.
The memory use graph for system monitor sometimes peaks at 100% RAM and 100% swap, and then dips sharply, after which time one of the four child R sessions has disappeared.
When using .verbose in foreach(), the log file shows that the simulation runs missing from the report are returned as NULL, while those that do appear in the report are returned as normal (a list of data frames and character variables).
The same set of parameters can produce this effect or can produce a complete graph; that is, the set of parameters is not diagnostic.
foreach() is used for approximately a dozen parameters. .combine is cbind, .inorder is FALSE, and all other arguments, such as .errorhandling, are left at their defaults.
This is of course quite irritating, since the simulations can take upwards of twenty minutes to run only to turn out to be useless due to missing data. Is there a way to either ensure that these "dropped" sessions are not dropped, or that if they are then this is in some way caught?
(If it's important, the computer being used has eight processors and hence runs four child processes, and the parallel backend registered is from the doMC package.)
The code is structured roughly as follows:
    test.results <- foreach(parameter.one = parameter.one.space, .combine=cbind) %:%
        foreach(parameter.two = parameter.two.space, .combine=cbind) %:%
        ...
        foreach(parameter.last = parameter.last.space, .combine=cbind, .inorder=FALSE) %dopar%
        {
            run.result <- simulationRun(parameter.one,
                                        parameter.two,
                                        ...
                                        parameter.last)
            # Return the parameters together with the result of this run
            list(list(parameters=list(parameter.one,
                                      parameter.two,
                                      ...
                                      parameter.last),
                      runResult=run.result))
        }
    return(test.results)
I'm guessing that you're running on Linux, because from your description, it sounds like the child R session is being killed by the Linux "out-of-memory killer". Coincidentally, I recently worked on the same basic problem where mclapply was used directly.
The doMC package uses the mclapply function to execute the foreach loop in parallel, and unfortunately, mclapply doesn't signal an error when a worker process unexpectedly dies. Instead, mclapply returns a NULL for all tasks allocated to that worker. I don't think there is any option to change this behavior in mclapply.
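Since the dropped tasks come back as NULL, they can at least be detected after the loop finishes. Note that cbind ignores NULL arguments, so with .combine=cbind the missing runs are likely to vanish silently; the sketch below assumes the results are instead collected as a plain list with one entry per task (the foreach default when .combine is omitted). The helper name rerun_missing and its arguments are hypothetical, not part of any package:

```r
# Hypothetical helper: `tasks` is a list of parameter sets, `results` is the
# list returned by the parallel loop (NULL where a worker died), and `fun`
# is the function that runs one task.
rerun_missing <- function(tasks, results, fun) {
  stopifnot(length(tasks) == length(results))
  missing <- which(vapply(results, is.null, logical(1)))
  if (length(missing) > 0) {
    warning(sprintf("%d of %d tasks returned NULL; re-running them sequentially",
                    length(missing), length(tasks)))
    # Re-run only the dropped tasks, one at a time, to keep memory use low
    results[missing] <- lapply(tasks[missing], fun)
  }
  results
}

# Simulated example: tasks 2 and 4 were "dropped" by a dead worker
tasks <- list(1, 2, 3, 4)
raw   <- list(1, NULL, 9, NULL)
fixed <- suppressWarnings(rerun_missing(tasks, raw, function(x) x^2))
```

Re-running sequentially is only one choice; the missing tasks could also be resubmitted in parallel with fewer workers. This also assumes that a legitimate result is never NULL.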
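The only work-arounds that I can think of are to check the results for NULLs and re-execute the missing tasks, or to use a different parallel backend, such as doParallel or doSNOW.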
If you use doParallel, make sure that you create and register a cluster object, otherwise mclapply will be used on Linux systems. With doParallel and doSNOW, if a worker dies abnormally, the master will get an error getting the task result from the dead worker:
Error in unserialize(node$con) : error reading from connection
In this case, the parallel backend will catch the error and use the specified error handling.
Keep in mind that using doParallel or doSNOW may use more memory than doMC, and so you may have to specify fewer workers with them in order to avoid running out of memory.
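As a rough sketch of the doParallel setup described above, with an explicit cluster object so that mclapply is not used: the worker count of 2 and the loop body are placeholders, and the tryCatch wrapper is an extra precaution of mine rather than something the backend requires.

```r
library(foreach)
library(doParallel)

# Create and register an explicit cluster; 2 workers is an arbitrary choice
cl <- parallel::makeCluster(2)
registerDoParallel(cl)

results <- tryCatch(
  foreach(i = 1:4, .errorhandling = "stop") %dopar% {
    i^2  # placeholder for the real simulationRun(...) call
  },
  error = function(e) {
    # Reached if a task fails or the result of a worker cannot be read
    message("A task or worker failed: ", conditionMessage(e))
    NULL
  },
  finally = parallel::stopCluster(cl)
)
```

With .errorhandling = "stop" a failure aborts the whole loop, which at least makes the problem visible instead of silently dropping runs. Using "pass" instead would return the error object in place of the failed task's result.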