I am tuning the mtry parameter of randomForest with the train function from the caret package. My X data has only 48 columns, yet train returns mtry=50 as the best value, which should not be a valid value (it is greater than 48). What explains this?
> dim(X)
[1] 93 48
> fit <- train(level~., data=data.frame(X,level), tuneLength=13)
> fit$finalModel
Call:
randomForest(x = x, y = y, mtry = param$mtry)
Type of random forest: classification
Number of trees: 500
No. of variables tried at each split: 50
OOB estimate of error rate: 2.15%
Confusion matrix:
     high low class.error
high   81   1  0.01219512
low     1  10  0.09090909
It is even worse if I don't set the tuneLength parameter:
> fit <- train(level~., data=data.frame(X,level))
> fit$finalModel
Call:
randomForest(x = x, y = y, mtry = param$mtry)
Type of random forest: classification
Number of trees: 500
No. of variables tried at each split: 55
OOB estimate of error rate: 2.15%
Confusion matrix:
     high low class.error
high   81   1  0.01219512
low     1  10  0.09090909
I can't provide the data because it is confidential. But there is nothing special about it: each column is either numeric or a factor, and there are no missing values.
The apparent discrepancy is most likely[1] between the number of columns in your data set and the number of predictors, which may not be the same if any of the columns are factors. You used the formula method, which will expand the factors into dummy variables. For example:
> head(model.matrix(Sepal.Width ~ ., data = iris))
  (Intercept) Sepal.Length Petal.Length Petal.Width Speciesversicolor Speciesvirginica
1           1          5.1          1.4         0.2                 0                0
2           1          4.9          1.4         0.2                 0                0
3           1          4.7          1.3         0.2                 0                0
4           1          4.6          1.5         0.2                 0                0
5           1          5.0          1.4         0.2                 0                0
6           1          5.4          1.7         0.4                 0                0
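So there are 4 predictor columns in iris (Sepal.Length, Petal.Length, Petal.Width and the factor Species), but you end up with 5 (non-intercept) predictors, because the 3-level factor Species is expanded into 2 dummy variables.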
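To check this on your own data, you can compare the number of predictor columns with the number of columns the formula expansion produces. The following is only a sketch using iris and base R; the variable names n_raw, n_expanded and n_factor_levels are made up for illustration, and the commented line assumes your objects are called X and level as in the question.

```r
# Number of predictor columns as stored in the data frame
# (all columns except the outcome Sepal.Width).
n_raw <- ncol(iris) - 1

# Number of predictors after the formula interface expands factors
# into dummy variables (drop the intercept column).
mm <- model.matrix(Sepal.Width ~ ., data = iris)
n_expanded <- ncol(mm) - 1

# Each factor with k levels contributes k - 1 dummy columns.
n_factor_levels <- nlevels(iris$Species)

cat("raw predictors:     ", n_raw, "\n")
cat("expanded predictors:", n_expanded, "\n")

# For the data in the question, the analogous check would be:
# ncol(model.matrix(level ~ ., data = data.frame(X, level))) - 1
```

If the expanded count for your data is 50 or more, then mtry=50 is within range for the model that was actually fit. If you would rather keep the factors as single predictors, the non-formula interface, train(x = X, y = level, ...), should pass the columns through without creating dummy variables, although I have not checked this against your data.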
Max
[1] This is why you need to provide a reproducible example. Often, when I get ready to ask a question, the answer becomes apparent while I take the time to write a good description of the issue.