Send function calls with different arguments to different processors in R using parallel package

I am trying to use the parallel package in R to send four different function calls to four different processors but am really lost as to how to assign different cores to do different work. I've read through the documentation for the parallel package, doParallel, Rmpi, and foreach in R. I've seen many posts using mclapply for calling different functions with the same argument. I'd like to call the same function with different arguments.

This is pseudocode of what I'd like to accomplish:

BEGIN parallel (core)
if(core == 1)
   foo(5, 4, 1/2, 3, "a")
if(core == 2)
   foo(5, 3, 1/3, 1, "b")
if(core == 3)
   foo(5, 4, 1/4, 1, "c")
if(core == 4)
   foo(5, 2, 1/5, 0, "d")
END parallel

This seems to be a perfect application of parallel computing, since these four function calls are independent of one another and can run separately to solve the problem I am working on. I don't know how to do this in R, though.

compstatguy asked Jul 30 '14 20:07




1 Answer

You could use the clusterApply function from the parallel package:

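library(parallel)
cl <- makeCluster(4)       # start four worker processes
clusterExport(cl, "foo")   # make foo available on every worker
cores <- seq_along(cl)     # worker indices 1 to 4
# With one task per worker, the i-th element of cores goes to the i-th worker in cl[cores]
r <- clusterApply(cl[cores], cores, function(core) {
  if (core == 1) {
    foo(5, 4, 1/2, 3, "a")
  } else if (core == 2) {
    foo(5, 3, 1/3, 1, "b")
  } else if (core == 3) {
    foo(5, 4, 1/4, 1, "c")
  } else if (core == 4) {
    foo(5, 2, 1/5, 0, "d")
  }
})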

This is very similar to your pseudocode and demonstrates how you can direct particular tasks to particular cluster workers using clusterApply. Note that by changing the value of cores, you can execute on any subset of the cluster workers that you choose.
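
As a rough sketch of the subset idea, the argument sets can be kept in a list indexed by worker and only the chosen workers used. Here foo is a hypothetical stand-in (the real foo is not shown in the question), and args_by_core and r_subset are names made up for this illustration:

```r
library(parallel)

# Hypothetical stand-in for the asker's foo(): returns its label,
# a toy value, and the PID of the worker process that ran it.
foo <- function(n, k, p, m, label) {
  list(label = label, value = n * k * p + m, pid = Sys.getpid())
}

cl <- makeCluster(4)
clusterExport(cl, "foo")   # workers need foo in their global environment

# One argument list per worker, in worker order (illustrative name)
args_by_core <- list(
  list(5, 4, 1/2, 3, "a"),
  list(5, 3, 1/3, 1, "b"),
  list(5, 4, 1/4, 1, "c"),
  list(5, 2, 1/5, 0, "d")
)

cores <- c(2, 4)           # run only on workers 2 and 4
r_subset <- clusterApply(cl[cores], args_by_core[cores],
                         function(a) do.call(foo, a))

stopCluster(cl)            # shut the workers down when finished
```

Each element of r_subset carries the PID of the worker that produced it, which is one way to check that the two calls really ran in different processes.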

If a "core ID" isn't really important, you can pass different arguments to the function by iterating over vectors for each of the arguments using the foreach package:

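library(doParallel)
registerDoParallel(cl)   # reuse the cluster created above as the foreach backend
r2 <- foreach(a1=c(5,5,5,5), a2=c(4,3,4,2), a3=c(1/2,1/3,1/4,1/5),
              a4=c(3,1,1,0), a5=c("a","b","c","d")) %dopar% {
  foo(a1, a2, a3, a4, a5)
}
stopCluster(cl)          # shut down the workers when finished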
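
If you would rather stay within the parallel package, clusterMap is another option for calling one function over several argument vectors in step. It is not part of the answer above. This is only a minimal sketch, with foo again a hypothetical stand-in and r3 an illustrative name:

```r
library(parallel)

# Hypothetical stand-in for the asker's foo()
foo <- function(n, k, p, m, label) {
  list(label = label, value = n * k * p + m)
}

cl <- makeCluster(4)

# clusterMap works like mapply: the i-th call receives the i-th element
# of each argument vector. foo is passed as the function argument, so it
# is shipped to the workers and no clusterExport call is needed here.
r3 <- clusterMap(cl, foo,
                 c(5, 5, 5, 5), c(4, 3, 4, 2), c(1/2, 1/3, 1/4, 1/5),
                 c(3, 1, 1, 0), c("a", "b", "c", "d"))

stopCluster(cl)

labels <- vapply(r3, function(x) x$label, character(1))
```

The result is a list with one element per call, in the same order as the argument vectors.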
Steve Weston answered Nov 03 '22 07:11
