
Parallel and Multicore Processing in R [closed]

This is towards the extreme end of R's capabilities, I think, but here goes...

I'm doing some heavy processing in R, for which I've written a function that does all the legwork from a single call. However, I'd like to thread it or utilise more than a single core.

I've looked at the Parallel package, which comes up as deprecated. I'd ideally like to call the function in a new thread.

I understand the complexities of parallel computing and that it's not the easiest thing in the world, but I'd appreciate it if anyone knew of some packages that would be useful or anything I've overlooked.

Cheers

asked Jun 11 '13 by A_Skelton73


1 Answer

It's the multicore package that is deprecated, not parallel. Take a look at the documentation for the mclapply function: it's the easiest way to execute functions in parallel with the parallel package. It's very similar to lapply, but takes a few additional optional arguments:

library(parallel)
# Toy worker: sleep for one second, then return the input value
myfun <- function(i) { Sys.sleep(1); i }
# Run the eight calls across four forked worker processes
mclapply(1:8, myfun, mc.cores=4)
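If you don't know ahead of time how many cores the machine has, parallel::detectCores() can report it. A minimal sketch along those lines, reusing the toy myfun from above (the variable name n_cores is just for illustration):

library(parallel)
# Ask the OS how many cores (logical CPUs) are available
n_cores <- detectCores()
myfun <- function(i) { Sys.sleep(1); i }
# Spread the eight calls over all detected cores
mclapply(1:8, myfun, mc.cores=n_cores)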

Note that mclapply uses forked processes, not threads, and doesn't support parallel execution on Windows. On Windows you should take a look at parLapply instead, which is also in the parallel package. It is likewise similar to lapply, but requires a cluster object as its first argument. Here's the same example, rewritten so that it works on essentially any platform:

library(parallel)
# Start four background R worker processes (works on Windows too)
cl <- makePSOCKcluster(4)
myfun <- function(i) { Sys.sleep(1); i }
# Distribute the eight calls across the workers in the cluster
parLapply(cl, 1:8, myfun)
# Shut the workers down when finished
stopCluster(cl)
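One caveat worth flagging with PSOCK clusters: the workers start as fresh R sessions, so any global objects (or loaded packages) that your function relies on must be sent to them explicitly, for example with clusterExport(). A minimal sketch, assuming a made-up global value offset that the worker function happens to use:

library(parallel)
cl <- makePSOCKcluster(4)
offset <- 100                               # exists only in the master session
myfun <- function(i) { Sys.sleep(1); i + offset }
clusterExport(cl, "offset")                 # copy 'offset' into each worker
parLapply(cl, 1:8, myfun)
stopCluster(cl)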
answered Oct 24 '22 by Steve Weston