I am looking to utilise the caret package with a metric that is not one of the default options. For the example below I use the Metrics package. I have read all of the relevant questions on Stack Overflow as well as the guide on the caret website, but am still receiving errors.
In the example below I wish to use Mean Absolute Error.
First, I create a summary function:
maefunction <- function(data, lev = NULL, model = NULL) {
  require(Metrics)
  MAE <- mae(data[, "obs"], data[, "pred"])
  out <- c(MAE)
  out
}
Now I pass the function to trainControl and call train:
library(caret)
GBM <- train(train$result ~ ., data = train, method = "gbm",
             trControl = trainControl(summaryFunction = maefunction),
             metric = MAE)
I receive the following error message:
Error in list_to_dataframe(res, attr(.data, "split_labels"), .id, id_as_factor) :
Results must be all atomic, or all data frames
In addition: Warning messages:
1: In if (metric %in% c("Accuracy", "Kappa")) stop(paste("Metric", :
the condition has length > 1 and only the first element will be used
2: In if (metric == "ROC" & !ctrl$classProbs) stop("train()'s use of ROC codes requires class probabilities. See the classProbs option of trainControl()") :
the condition has length > 1 and only the first element will be used
3: In if (!(metric %in% perfNames)) { :
the condition has length > 1 and only the first element will be used
4: In train.default(x, y, weights = w, ...) :
The metric "4" was not in the result set. will be used instead.The metric "0.5" was not in the result set. will be used instead.
The caret package has several functions that attempt to streamline the model building and evaluation process. The train function can be used to evaluate, using resampling, the effect of model tuning parameters on performance.
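For example (a minimal sketch that is not part of the original question; the data set and grid values are purely illustrative, and the gbm and mlbench packages are assumed to be installed):

library(caret)
library(mlbench)                                    # for the BostonHousing data
data(BostonHousing)

ctrl <- trainControl(method = "cv", number = 10)    # 10-fold cross-validation
gbmGrid <- expand.grid(n.trees = c(50, 100),
                       interaction.depth = c(1, 3),
                       shrinkage = 0.1,
                       n.minobsinnode = 10)
set.seed(1)
gbmFit <- train(medv ~ ., data = BostonHousing, method = "gbm",
                tuneGrid = gbmGrid, trControl = ctrl, verbose = FALSE)
gbmFit$results    # resampled performance for each row of gbmGrid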
A couple of excerpts from the caret documentation might help. For classification, ROC curve analysis is conducted on each predictor. For two-class problems, a series of cutoffs is applied to the predictor data to predict the class. The sensitivity and specificity are computed for each cutoff and the ROC curve is computed.
Alternatively, you can define a custom summary function that combines both twoClassSummary and prSummary (a current favorite), which provides the following possible evaluation metrics: AUROC, Sens, Spec, AUPRC, Precision, Recall and F, any of which can be used as the metric argument. A sketch of such a function is shown below.
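This is a minimal sketch, not code from the original answer: it simply concatenates caret's two built-in summaries, and assumes the MLmetrics package (used internally by prSummary) is installed.

library(caret)

comboSummary <- function(data, lev = NULL, model = NULL) {
  # concatenate the named vectors returned by the two built-in summaries:
  # ROC, Sens, Spec from twoClassSummary and AUC, Precision, Recall, F from prSummary
  c(twoClassSummary(data, lev, model), prSummary(data, lev, model))
}

ctrl <- trainControl(summaryFunction = comboSummary,
                     classProbs = TRUE,    # both summaries need class probabilities
                     method = "cv", number = 5)

Any of the returned names (for example "ROC", "Sens", "AUC", "Precision", "F") can then be passed as the metric argument of train().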
I think that you have to use a named vector (see the example below). I didn't explicitly say that in the documentation, so I will update that section.
Max
library(caret)
library(Metrics)     # provides mae()
library(mlbench)
data(BostonHousing)

maeSummary <- function(data, lev = NULL, model = NULL) {
  out <- mae(data$obs, data$pred)
  names(out) <- "MAE"    # the summary function must return a *named* vector
  out
}

mControl <- trainControl(summaryFunction = maeSummary)

marsGrid <- expand.grid(degree = 1, nprune = (1:10) * 2)

set.seed(1)
earthFit <- train(medv ~ .,
                  data = BostonHousing,
                  method = "earth",
                  tuneGrid = marsGrid,
                  metric = "MAE",
                  maximize = FALSE,
                  trControl = mControl)
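Once the model is fit, the resampled MAE for each candidate in marsGrid can be inspected directly from the returned train object, and the winning tuning values (lowest MAE, since maximize = FALSE) are stored in bestTune:

earthFit$results     # one row per degree/nprune combination, with MAE and MAESD columns
earthFit$bestTune    # the degree/nprune pair with the lowest resampled MAE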