R-Caret, caretList, The metric "Accuracy" was not in the result set

Tags: r, r-caret

I am trying to learn caret and caretList by following the tutorial "caretEnsemble Classification example".

I ran into a few errors and found fixes for the basic setup. However, I still get the following warnings:

Warning messages:
1: In train.default(x, y, weights = w, ...) :
The metric "Accuracy" was not in the result set. ROC will be used instead.
2: In train.default(x, y, weights = w, ...) :
The metric "Accuracy" was not in the result set. ROC will be used instead.

My setup is:

#Libraries
library(caret)
library(devtools)
library(caretEnsemble)

#Data
library(mlbench)
dat <- mlbench.xor(500, 2)
X <- data.frame(dat$x)
Y <- factor(ifelse(dat$classes=='1', 'Yes', 'No'))

#Split train/test
train <- runif(nrow(X)) <= .66

#Setup CV Folds
#returnData=FALSE saves some space
folds=5
repeats=1
myControl <- trainControl(method='cv', 
                      number=folds, 
                      repeats=repeats, 
                      returnResamp='none', 
                      classProbs=TRUE,
                      returnData=FALSE, 
                      savePredictions=TRUE, 
                      verboseIter=TRUE, 
                      allowParallel=TRUE,
                      summaryFunction=twoClassSummary,
                      index=createMultiFolds(Y[train], 
                                             k=folds, 
                                             times=repeats)
)
#Make list of all models
all.models<-caretList(Y~., data=X, trControl=myControl, methodList=c("blackboost", "parRF"))

I edited the section of "train all models" using caretList so that it will work with caretEnsemble and caretStack further down the code (link provided above).

How do I get the accuracies so that I can use them in caretEnsemble and caretStack?

ifeelstupid asked Dec 18 '22 18:12



1 Answer

I assume you want 'Accuracy' to be the metric used to select the optimal base learner models across their resamples, and later the metalearner via caretEnsemble or caretStack.

In that case, do not set summaryFunction = twoClassSummary in trainControl. twoClassSummary computes only ROC, Sens and Spec, so train cannot find 'Accuracy' in the result set and falls back to 'ROC', which is exactly what the warning reports. Instead, keep the default setting for summaryFunction (defaultSummary) by not specifying it in trainControl. Because the response is categorical, train, which caretList calls for each model, will then use 'Accuracy' as the performance metric.
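The difference between the two summary functions can be checked directly, without training anything. The following is a minimal sketch: the data frame `toy` and its random predictions are made up purely for illustration, and it assumes the pROC package is installed, since twoClassSummary needs it to compute the ROC value.

```r
library(caret)

set.seed(42)
lev <- c("No", "Yes")

# Hypothetical toy predictions, only used to inspect which metrics come back.
obs      <- factor(sample(lev, 50, replace = TRUE), levels = lev)
prob_yes <- runif(50)
toy <- data.frame(
  obs  = obs,
  pred = factor(ifelse(prob_yes > 0.5, "Yes", "No"), levels = lev),
  No   = 1 - prob_yes,   # class probability columns are required
  Yes  = prob_yes        # by twoClassSummary
)

# Metric names produced by each summary function
default_metrics  <- names(defaultSummary(toy, lev = lev))
twoclass_metrics <- names(twoClassSummary(toy, lev = lev))

print(default_metrics)   # includes "Accuracy"
print(twoclass_metrics)  # no "Accuracy" here, hence the warning
```

Whichever summary function is set in trainControl determines which names train can accept as metric; asking for a name that is not returned triggers the fallback described above.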

In addition, there are a few other things to note:

  • Do not set returnResamp = 'none' in trainControl. If you do, the per-resample results are discarded and you cannot compare the individual models' accuracies later via summary(resamples(model.list)).
  • Although you created an index to split the data into a training and a test set, you do not use it when passing the data to caretList, so the resampling indices built from Y[train] do not match the rows of the full X. The caretList call should begin like this: caretList(Y[train] ~ ., data=X[train, ], ...
  • The tutorial you mentioned above is somewhat outdated. You should also check out the package's current vignette and this tutorial from MachineLearningMastery. The latter also uses "Accuracy" as the performance metric in its example.
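
Putting these points together, a corrected setup might look roughly like the sketch below. It is not the tutorial's code: the names `X_train` and `Y_train` are introduced here for readability, "rpart" and "glm" replace "blackboost" and "parRF" only to keep the dependencies small, and the behaviour of caretList and resamples may differ between caretEnsemble versions.

```r
library(caret)
library(caretEnsemble)
library(mlbench)

set.seed(1)
dat <- mlbench.xor(500, 2)
X <- data.frame(dat$x)
Y <- factor(ifelse(dat$classes == "1", "Yes", "No"))

# Split train/test and actually subset the data with it
train   <- runif(nrow(X)) <= 0.66
X_train <- X[train, ]
Y_train <- Y[train]

folds <- 5
myControl <- trainControl(
  method          = "cv",
  number          = folds,
  classProbs      = TRUE,      # still needed for ensembling/stacking
  savePredictions = "final",
  # no summaryFunction: the default (defaultSummary) yields Accuracy and Kappa
  # no returnResamp = "none": resample results are kept for resamples()
  index = createFolds(Y_train, k = folds, returnTrain = TRUE)
)

# Substituted, lightweight methods (assumption; the question used others)
all.models <- caretList(
  Y_train ~ ., data = X_train,
  trControl  = myControl,
  metric     = "Accuracy",
  methodList = c("rpart", "glm")
)

# Compare the base learners' resampled accuracies
res <- resamples(all.models)
print(summary(res))
```

The resulting all.models list can then be passed on to caretEnsemble or caretStack as in the tutorial.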
alex23lemm answered Mar 30 '23 00:03
