
How to use the parRF method so random forest runs faster

I would like to run random forest on a large data set: 100k * 400. When I use random forest it takes a long time. Can I use the parRF method from the caret package to reduce the running time? What is the right syntax for that? Here is an example data frame:

dat <- read.table(text = " TargetVar  Var1    Var2       Var3
 0        0        0         7
 0        0        1         1
 0        1        0         3
 0        1        1         7
 1        0        0         5
 1        0        1         1
 1        1        0         0
 1        1        1         6
 0        0        0         8
 0        0        1         5
 1        1        1         4
 0        0        1         2
 1        0        0         9
 1        1        1         2  ", header = TRUE)

I tried:

library('caret')
m<-randomForest(TargetVar ~ Var1 + Var2 + Var3, data = dat, ntree=100, importance=TRUE, method='parRF')

But I don't see much of a difference. Any ideas?

mql4beginner asked Jan 11 '23 15:01


1 Answer

The reason you don't see a difference is that you aren't actually using the caret package. You load it with library(), but then you call randomForest(), which belongs to the randomForest package and does not go through caret. The method='parRF' argument only means something to caret's train() function.

I suggest starting by creating a data frame (or data.table) that contains only your input variables, and a vector containing your outcomes. I'm referring to the recently updated caret docs.

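x <- dat[, c("Var1", "Var2", "Var3")]   # predictors only, original column names kept
y <- factor(dat$TargetVar)              # factor so caret fits a classifier; leave numeric for regression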

Next, verify that you have the parRF method available. I didn't until I updated my caret package to the most recent version (6.0-29).

library("randomForest")
library("caret")
names(getModelInfo())
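
Since that list is long, a narrower check may be easier to read. The following is a minimal sketch, assuming a caret version that ships parRF; the variable names has_parRF and params are only illustrative. modelLookup() also reports which tuning parameters the method accepts, which matters when building a tuneGrid.

```r
library(caret)

# TRUE if caret knows a model with the code "parRF"
has_parRF <- "parRF" %in% names(getModelInfo())
print(has_parRF)

# Tuning parameters that train() will accept in tuneGrid for this method
params <- modelLookup("parRF")
print(params)
```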

You should see parRF in the output. Now you're ready to create your training model.

library(foreach)

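# parRF has a single tuning parameter, mtry; ntree and importance are
# passed through train() to randomForest() rather than via tuneGrid
rfParam <- expand.grid(mtry = 2)

m <- train(x, y, method="parRF", tuneGrid=rfParam, ntree=100, importance=TRUE)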
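
One caveat: parRF distributes tree building through foreach, so it only runs in parallel once a parallel backend has been registered. Below is a minimal end-to-end sketch on the toy data from the question. The doParallel backend, the 2 workers, the seed, the 2-fold cross-validation and mtry = 2 are all assumptions made for illustration, not settings from the original answer.

```r
library(caret)
library(randomForest)
library(doParallel)   # assumption: doParallel as the foreach backend

# Toy data from the question
dat <- read.table(text = "TargetVar Var1 Var2 Var3
0 0 0 7
0 0 1 1
0 1 0 3
0 1 1 7
1 0 0 5
1 0 1 1
1 1 0 0
1 1 1 6
0 0 0 8
0 0 1 5
1 1 1 4
0 0 1 2
1 0 0 9
1 1 1 2", header = TRUE)

x <- dat[, c("Var1", "Var2", "Var3")]
y <- factor(dat$TargetVar)          # factor outcome, so a classifier is fitted

# Register a backend; without this, foreach falls back to sequential execution
cl <- makeCluster(2)                # assumption: 2 worker processes
registerDoParallel(cl)

set.seed(42)
m <- train(x, y,
           method    = "parRF",
           tuneGrid  = expand.grid(mtry = 2),
           # keep resampling sequential so the workers are used for the trees
           trControl = trainControl(method = "cv", number = 2,
                                    allowParallel = FALSE),
           ntree = 100, importance = TRUE)

# Release the workers and return foreach to sequential mode
stopCluster(cl)
registerDoSEQ()

print(m)
```

On a data set this small the cluster start-up cost will likely outweigh any gain; a speed-up is only to be expected on something closer to the 100k * 400 case, and the worker count should be chosen to match the available cores.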
Lenwood answered Jan 21 '23 01:01
