
R - problem with foreach %dopar% inside function called by optim

Calling a function that contains a foreach %dopar% construct from optim causes an error:

> workers <- startWorkers(6) # 6 cores
> 
> registerDoSMP(workers)
> 
> t0 <- Sys.time() 
>
> optim(w,maxProb2,control=list(fnscale=-1))
> 
> Error in { : task 1 failed - "unused argument(s) (isPrebuilt = TRUE)"
> 
> Sys.time()-t0
>
> Time difference of 2.032 secs
> 
> stopWorkers(workers)

The called function looks like this:

> maxProb2 <- function(wp) {
>   
>   r <- foreach (i=s0:s1, .combine=c) %dopar% { pf(i,x[i,5],wp,isPrebuilt=TRUE) }
>   
>   cat("w=",wp,"max=",sum(r),"\n")
>   
>   sum(r)
>   
> }

pf is some other function, x is a static table of pre-computed elements.

Also calling the function to be optimized just once causes the same error:

> workers <- startWorkers(6) # 6 cores
>
> Warning message:
> In startWorkers(6) : there is an existing doSMP session using doSMP1
>
> registerDoSMP(workers)
>
> maxProb2(w)
> Error in { : task 1 failed - "unused argument(s) (isPrebuilt = TRUE)"
>
> stopWorkers(workers)

Strangely, the identical code works fine when run directly a single time (optim calls the same function many times):

> workers <- startWorkers(6) # 6 - number of cores
> 
> Warning message:
> In startWorkers(6) : there is an existing doSMP session using doSMP1
>
> registerDoSMP(workers)
> 
> r <- foreach (i=s0:s1, .combine=c) %dopar% { pf(i,x[i,5],w,isPrebuilt=TRUE) } 
>   
> sum(r)
> [1] 187.1781
> 
> stopWorkers(workers)

The called function (maxProb2) works fine when %do% is used instead of %dopar%.

How can I correctly call a function that contains a foreach %dopar% construct?

UPDATE 2011-07-17:

I have renamed the pf function to probf, but the problem remains.

The probf function is defined in the script, not in an external package.

Two notes: OS: Windows 7, IDE: Revolution Analytics Enterprise 4.3

> workers <- startWorkers(workerCount = 3)
>
> registerDoSMP(workers)
>
> maxProb2(w)
>
Error in { : task 1 failed - "could not find function "probf""

mjaniec asked Jul 14 '11 07:07


3 Answers

I ran into the same problem, and the issue is that the calling environment is not included in the worker processes. An error like yours,

Error in { : task 1 failed - "could not find function "simple_fn""

can be reproduced by this very simple example:

simple_fn <- function(x)
    x+1

test_par <- function(){
    library("parallel")
    no_cores <- detectCores()
    library("foreach")
    cl<-makeCluster(no_cores)
    library("doSNOW")
    registerDoSNOW(cl)
    out <- foreach(i=1:10) %dopar% {
        simple_fn(i)
    }

    stopCluster(cl)
    return(out)
}

test_par()

Now all you need to do is change foreach(i=1:10) into foreach(i=1:10, .export=c("simple_fn")). If you want to export your complete global environment, just write .export=ls(envir=globalenv()), and you will have it, for better or worse.
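
As a rough sketch, the change described above could look like this when applied to the example. It assumes the foreach and doSNOW packages are installed. The name test_par_fixed and the two-worker cluster size are arbitrary choices for illustration, not part of the original example.

```r
library(foreach)
library(doSNOW)   # also attaches snow, which provides makeCluster/stopCluster

simple_fn <- function(x)
    x + 1

# Hypothetical name: a variant of test_par with the .export argument added
test_par_fixed <- function() {
    cl <- makeCluster(2)        # two workers, an arbitrary size for this sketch
    on.exit(stopCluster(cl))    # shut the workers down even if the loop fails
    registerDoSNOW(cl)
    # .export sends simple_fn from the calling session to every worker
    foreach(i = 1:10, .export = c("simple_fn")) %dopar% {
        simple_fn(i)
    }
}

out <- test_par_fixed()
```

Without a .combine argument, foreach returns a list, so out here is a list with one element per iteration.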

Max Gordon answered Nov 05 '22 11:11


[[Edited]]

Your pf function and your "static table" x must be distributed to all worker nodes. You must read the documentation for your parallel library on how that works.
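
With foreach, one way to distribute both objects is the .export argument. The following is only a sketch under assumptions: probf, x, s0 and s1 are made-up stand-ins for the objects in the question, and doSNOW is used as the backend in place of doSMP.

```r
library(foreach)
library(doSNOW)

# Hypothetical stand-ins for the question's objects (not the real ones)
x  <- matrix(runif(50), nrow = 10, ncol = 5)
s0 <- 1
s1 <- 10
probf <- function(i, xi, wp, isPrebuilt = FALSE) xi * wp

maxProb2 <- function(wp) {
  # Name the function and the table explicitly so each worker gets a copy;
  # wp is local to maxProb2, so foreach exports it automatically
  r <- foreach(i = s0:s1, .combine = c, .export = c("probf", "x")) %dopar% {
    probf(i, x[i, 5], wp, isPrebuilt = TRUE)
  }
  sum(r)
}

cl <- makeCluster(2)   # two workers, an arbitrary size for this sketch
registerDoSNOW(cl)
res <- maxProb2(2)
stopCluster(cl)
```

Once the cluster is registered, the same maxProb2 can be passed to optim as in the question, since every call carries its own exports.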

It seems that when run through optim, the pf function that gets found is a different one (probably stats::pf, which does not have an isPrebuilt argument).
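
A quick way to check this suspicion is to list the arguments of the pf that ships with R. This sketch uses only base R; the variable names are arbitrary.

```r
# Argument names of the F-distribution function in the stats package
stats_pf_args <- names(formals(stats::pf))
print(stats_pf_args)

# TRUE would mean stats::pf accepts isPrebuilt; it does not
has_isPrebuilt <- "isPrebuilt" %in% stats_pf_args
print(has_isPrebuilt)
```

If a worker resolves pf to stats::pf, a call with isPrebuilt = TRUE would fail with an "unused argument" error like the one in the question.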

Can you try renaming your pf function (for example to mypf)?

mypf <- pf # renaming the function

maxProb2 <- function(wp) {
  r <- foreach (i=s0:s1, .combine=c) %dopar% { mypf(i,x[i,5],wp,isPrebuilt=TRUE) }
  cat("w=",wp,"max=",sum(r),"\n")
  sum(r)
}

Or, if your pf function is part of a package with a namespace (say, mypackage), you could reference it like this: mypackage::pf

maxProb2 <- function(wp) {
  r <- foreach (i=s0:s1, .combine=c) %dopar% { mypackage::pf(i,x[i,5],wp,isPrebuilt=TRUE) }
  cat("w=",wp,"max=",sum(r),"\n")
  sum(r)
}

Tommy answered Nov 05 '22 11:11


A quick fix for problems with foreach %dopar% is to reinstall these packages:

install.packages("doSNOW")

install.packages("doParallel") 

install.packages("doMPI")

As mentioned in various threads on StackOverflow, these packages provide parallel backends in R. A bug that existed in older versions of these packages has since been fixed. It worked in my case. I should mention that reinstalling will most likely help even if you are not using these packages directly in your project or package.

M_D answered Nov 05 '22 10:11