
Caret in R: Set number of cores for allowParallel?

I am using R's caret package, and when training with train I set the allowParallel parameter (in trainControl), which works. However, it uses all of the cores, and since the training runs on my local PC I would rather leave one core free so that I can keep working while the models train. Is there any way to do this?

From what I've gathered, different model types might use different parallelization packages. I am working on Windows, so I guess it's not using doMC (where I know how to set the number of cores).

asked Jan 08 '18 08:01 by Thomas


2 Answers

After more research, I found a way to use the number of cores I want. Extra arguments given to train are passed on to the underlying model function, and ranger accepts num.threads, so num.threads = 7 uses 7 of my 8 cores:

# tuneGrid and df_tree_train are defined elsewhere
rf_model <- train(Target ~ ., data = df_tree_train, method = "ranger",
                  trControl = trainControl(method = "oob",
                                           verboseIter = TRUE,
                                           allowParallel = TRUE,
                                           classProbs = TRUE),
                  verbose = TRUE,
                  tuneGrid = tuneGrid,
                  num.trees = 50,
                  num.threads = 7)  # <- This one (passed on to ranger)
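Rather than hard-coding 7, the thread count can be derived from the machine. The following is only a sketch under some assumptions: iris stands in for df_tree_train, the names n_threads and rf_sketch are made up for illustration, and the training step is skipped if the packages caret needs for method = "ranger" are not installed.

```r
library(parallel)

# Leave one logical core free for interactive work.
# detectCores() can return NA on some platforms, so fall back to 1 thread.
n_cores   <- detectCores()
n_threads <- if (is.na(n_cores)) 1L else max(1L, n_cores - 1L)

# Packages assumed to be needed by caret for method = "ranger".
pkgs <- c("caret", "ranger", "e1071", "dplyr")
have_pkgs <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (have_pkgs) {
  library(caret)
  # iris is a stand-in data set, used purely for illustration.
  rf_sketch <- train(
    Species ~ ., data = iris, method = "ranger",
    trControl   = trainControl(method = "oob"),
    num.trees   = 50,
    num.threads = n_threads  # forwarded through ... to ranger
  )
}
```

Note that num.threads only limits ranger's own threading; other model types may need a different argument or a registered foreach backend instead.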
answered Oct 17 '22 12:10 by Thomas


I'm surprised that:

library("doParallel")
registerDoParallel(parallel::detectCores() - 1)

does not do it. Maybe there is nested parallelism that does not respect the setting above. You could try the doFuture package:

library("doFuture")
registerDoFuture()
plan(multisession, workers = availableCores() - 1)

This should protect against unwanted nested parallelism.

EDIT 2022-01-29: The 'multiprocess' backend is deprecated in favor of 'multisession'. If you want forked parallel processing, use 'multicore' (forking is not available on Windows).
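To check whether a registered backend is actually limited to the intended number of workers, one option is to create the cluster explicitly and ask foreach what it sees. This is a minimal sketch, assuming the doParallel package is installed; the names n_workers, backend and registered are illustrative only.

```r
library(doParallel)  # attaches foreach and parallel as dependencies

# Leave one logical core free; fall back to 1 worker if detection fails.
n_cores   <- parallel::detectCores()
n_workers <- if (is.na(n_cores)) 1L else max(1L, n_cores - 1L)

# An explicit PSOCK cluster also works on Windows, where forking is unavailable.
cl <- parallel::makePSOCKcluster(n_workers)
registerDoParallel(cl)

# Ask foreach which backend is registered and how many workers it will use.
backend    <- foreach::getDoParName()
registered <- foreach::getDoParWorkers()
message("backend: ", backend, ", workers: ", registered)

# ... call caret::train() with trainControl(allowParallel = TRUE) here ...

# Release the workers and fall back to sequential execution.
parallel::stopCluster(cl)
foreach::registerDoSEQ()
```

If the reported worker count is right but all cores are still busy, the model itself (for example ranger) is probably running its own threads on top of the foreach workers.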

answered Oct 17 '22 13:10 by HenrikB