
Parallel Processing in R in caret

The caret documentation says that the following code enables parallel processing:

library(doMC) 
registerDoMC(cores = 5) 
## All subsequent models are then run in parallel

But in the latest R version (3.4), the doMC package is not available. Is there another way to do parallel processing?

Update: What Roman suggested worked. doMC is not available for Windows. On Windows, use the doParallel package: create a cluster with cls <- makeCluster(n), where n is the number of cores to use, and then call registerDoParallel(cls). Also make sure allowParallel is set to TRUE in the trainControl() object that you pass to train() as trControl.
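For anyone landing here, the steps in the update can be put together roughly as follows. This is only a sketch: the worker count of 2, the 5-fold cross-validation, the knn model and the iris data are illustrative assumptions, not part of the original question.

```r
library(caret)
library(doParallel)

# Assumption: 2 worker processes; choose a number that suits your machine.
cls <- makeCluster(2)
registerDoParallel(cls)

# allowParallel is an argument of trainControl(); the resulting object is
# passed to train() through its trControl argument.
ctrl <- trainControl(method = "cv", number = 5, allowParallel = TRUE)

# Illustrative model only: k-nearest neighbours on the built-in iris data.
set.seed(112233)
fit <- train(Species ~ ., data = iris, method = "knn", trControl = ctrl)

# Release the workers and go back to sequential execution.
stopCluster(cls)
registerDoSEQ()
```

The same code should also run on Linux and macOS, since doParallel is not tied to one platform.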

Dhruv Mahajan asked Jun 27 '17 07:06


2 Answers

To expand on the other answer, and drawing mainly on the caret package documentation, here is a recipe that works for me:

set.seed(112233)
library(parallel) 
# Calculate the number of cores
no_cores <- detectCores() - 1

library(doParallel)
# create the cluster for caret to use
cl <- makePSOCKcluster(no_cores)
registerDoParallel(cl)

# Run your regular caret train() call here, with
# allowParallel = TRUE in trainControl(). Only the
# functions that caret has implemented with parallel
# support will actually use the cluster.

stopCluster(cl)
registerDoSEQ()
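
One weakness of the recipe is that the workers stay alive if the training step throws an error before stopCluster() is reached. A possible way around this is to wrap the setup and cleanup in a small function that uses on.exit(). Note that with_cluster is a hypothetical helper written for this sketch, not something provided by caret or doParallel, and the foreach loop merely stands in for a train() call.

```r
library(doParallel)  # also attaches foreach and parallel

# Hypothetical helper: runs task() with a registered cluster and always
# cleans up afterwards, even if task() fails.
with_cluster <- function(n_workers, task) {
  cl <- makePSOCKcluster(n_workers)
  on.exit({
    stopCluster(cl)
    registerDoSEQ()
  }, add = TRUE)
  registerDoParallel(cl)
  task()
}

# Stand-in workload; in practice this would be the caret train() call.
result <- with_cluster(2, function() {
  foreach(i = 1:4, .combine = c) %dopar% i^2
})
```

With this pattern the cleanup lives in one place, so the calling code cannot forget it.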
Pablo Adames answered Sep 22 '22 01:09


doMC relies on multicore functionality (process forking) to compute in parallel. That is fine on supported platforms, but Windows is not one of them.

You can use another framework, such as parallel, which ships with R. To use it with caret, you will need the doParallel package, which works on all three major platforms.
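
As a quick check that the doParallel backend is really in use, something along these lines should do. It is a minimal sketch that assumes 2 workers; the variable names are only for illustration.

```r
library(doParallel)  # also attaches foreach and parallel

cl <- makeCluster(2)  # assumption: 2 workers
registerDoParallel(cl)

# Ask foreach which backend is registered and how many workers it has.
workers <- getDoParWorkers()
backend <- getDoParName()

# Each iteration reports the process ID of the worker that ran it.
pids <- foreach(i = 1:4, .combine = c) %dopar% Sys.getpid()

stopCluster(cl)
registerDoSEQ()
```

If the process IDs in pids differ from Sys.getpid() in the main session, the iterations ran on the workers rather than sequentially.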

Roman Luštrik answered Sep 23 '22 01:09