Using source() within parallel foreach loops

Tags: foreach, r

Here is a toy example to illustrate my problem.

library(foreach)
library(doMC)
registerDoMC(cores=2)

foreach(i = 1:2) %dopar%{
  i + 2
}
[[1]]
[1] 3

[[2]]
[1] 4

So far so good...

But if the code i + 2 is saved in the file addition.R and I run that file with source(), then

> foreach(i = 1:2) %dopar%{
+   source("addition.R")
+ }
Error in { : task 1 failed - "object 'i' not found"

asked Feb 16 '15 at 07:02 by Marco



2 Answers

I cannot fully reproduce your toy example, but I had a similar problem, which I was able to solve with:

source(file, local = TRUE)

which should evaluate the sourced code in the local environment, i.e. the one where i is defined.
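As a minimal sketch of this fix without any parallel backend, the snippet below writes the one-line script from the question to a temporary file (a stand-in for addition.R) and sources it from inside a function. The helper name add_two and the temporary path are made up for illustration; the only point is that local = TRUE makes the function argument i visible to the sourced code.

```r
# Write the one-line script from the question to a temporary file
# (the temporary path is a stand-in for addition.R).
script <- tempfile(fileext = ".R")
writeLines("i + 2", script)

# Hypothetical helper: i exists only inside this function's frame.
add_two <- function(i) {
  # local = TRUE evaluates the file in the calling frame, where i is defined.
  # source() returns a list; $value holds the last evaluated expression.
  source(script, local = TRUE)$value
}

results <- lapply(1:2, add_two)
str(results)
```

Extracting $value is a choice made here so that the results are plain numbers; without it, each element is the list that source() returns.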

answered Oct 15 '22 at 18:10 by Sosel


The comment by NiceE and the answer by Sosel already address this: source(file) defaults to source(file, local = FALSE), which means that the code in the sourced file is evaluated in the global environment (the "user's workspace"), cf. ?source. Note that there is no variable i in the global environment. The solution is to make sure the file is sourced in the environment that calls it, i.e. to use source(file, local = TRUE).

Solution:

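library("foreach")

# Baseline: inline code, no source(). With no backend registered yet,
# %dopar% runs sequentially and issues a warning.
y <- foreach(i = 1:2) %dopar% {
  i + 2
}
str(y)

# Register two cores, then source the file in the calling environment.
doMC::registerDoMC(cores = 2L)
y <- foreach(i = 1:2) %dopar% {
  # local = TRUE lets the sourced code see the loop variable i.
  # Each result is the list returned by source(); the number is in $value.
  source("addition.R", local = TRUE)
}
str(y)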

Example of the same problem with a for() loop:

The fact that source() evaluates in the global environment, which differs from the calling environment where i lives, can also be illustrated with a regular for loop by running the loop in an environment other than the global one, e.g. inside a function or by:

local({
  for(i in 1:2) {
    source("addition.R")
  }
})

which gives:

Error in eval(ei, envir) : object 'i' not found
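The contrast can be sketched in one self-contained run. The snippet below uses a temporary file as a stand-in for addition.R and assumes that no variable named i exists in the global environment; it captures the error from the default call with tryCatch() and then repeats the call with local = TRUE. The names outcome, default_msg and fixed are illustrative only.

```r
# Temporary stand-in for addition.R, containing the single line "i + 2".
script <- tempfile(fileext = ".R")
writeLines("i + 2", script)

outcome <- local({
  default_msg <- NULL
  fixed <- numeric(0)
  for (i in 1:2) {
    # Default (local = FALSE): the file is evaluated in the global
    # environment, so this fails unless a global i happens to exist.
    default_msg <- tryCatch(
      {
        source(script)
        "no error"
      },
      error = function(e) conditionMessage(e)
    )
    # local = TRUE: the file is evaluated here, where the loop variable lives.
    fixed[i] <- source(script, local = TRUE)$value
  }
  list(default = default_msg, fixed = fixed)
})
str(outcome)
```

Here outcome$default holds the captured error message and outcome$fixed holds the two sums computed with local = TRUE.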

Now, the reason why foreach(i = 1:2) %dopar% { source("addition.R") } does work with registerDoSEQ(), if and only if it is called from the global environment, is that the foreach iteration is then evaluated in the calling environment, which is the global environment, which is also the environment that source() uses. However, wrapping the call as local(foreach(i = 1:2) %dopar% { ... }) makes it fail as well, analogously to the local(for(i in 1:2) { ... }) call above.

In conclusion: nothing magic happens, but it is a bit tedious to understand.

answered Oct 15 '22 at 18:10 by HenrikB