I would like to know what techniques and metrics are used to evaluate how accurate/good an algorithm is, and how to use a given metric to draw a conclusion about an ML model.
One way to do this is to use precision and recall, as defined here on Wikipedia. Another way is to use the accuracy metric, as explained here. So, what I would like to know is: are there other metrics for evaluating an ML model?
Metrics like accuracy, precision, and recall are good ways to evaluate classification models on balanced datasets, but if the data is imbalanced and there is a class disparity, then other methods like ROC/AUC and the Gini coefficient do a better job of evaluating model performance.
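One way to see why AUC handles imbalance well: ROC AUC equals the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one, so the class proportions cancel out. A minimal pure-Python sketch on made-up imbalanced data (the labels and scores are illustrative, not from any real model):

```python
def auc(labels, scores):
    """ROC AUC via pairwise comparison: fraction of positive/negative
    pairs where the positive is ranked higher (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Imbalanced toy data: 2 positives, 6 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.4, 0.5, 0.3, 0.2, 0.2, 0.1, 0.05]
print(auc(y_true, y_score))  # 11 of 12 pairs ranked correctly
```

In practice you would call a library routine such as scikit-learn's `roc_auc_score`; the point of the hand-rolled version is the ranking interpretation.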
- Classification metrics: accuracy, precision, recall, F1-score, ROC, AUC, …
- Regression metrics: MSE, MAE
- Ranking metrics: MRR, DCG, NDCG
- Statistical metrics: correlation
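For the regression metrics in the list above, a minimal sketch computing MSE and MAE by hand (the target and predicted values are made up for illustration):

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error: average of absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(mse(y_true, y_pred))  # 0.375
print(mae(y_true, y_pred))  # 0.5
```

MSE penalizes large errors more heavily than MAE, which is why the choice between them matters when outliers are present.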
Accuracy, confusion matrix, log-loss, and AUC-ROC are some of the most popular metrics.
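Of the metrics just named, log-loss is the least self-explanatory: it scores the predicted probabilities rather than the hard predictions. A minimal sketch of binary log-loss on made-up probabilities:

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Binary cross-entropy: penalizes confident wrong predictions."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Toy example: mostly-correct, moderately confident probabilities.
print(log_loss([1, 0, 1, 1], [0.9, 0.1, 0.8, 0.6]))
```

A lower log-loss is better; a model that is confidently wrong (e.g. predicting 0.99 for a true negative) is punished far more than one that hedges at 0.5.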
A while ago I compiled a list of metrics used to evaluate classification and regression algorithms, in the form of a cheat sheet. Some metrics for classification: precision, recall, sensitivity, specificity, F-measure, Matthews correlation coefficient, etc. They are all based on the confusion matrix. Other metrics exist for regression (continuous output variable).
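A sketch of how several of those classification metrics fall out of the confusion-matrix counts (the toy labels and predictions are made up):

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)

precision = tp / (tp + fp)
recall = tp / (tp + fn)          # a.k.a. sensitivity
specificity = tn / (tn + fp)
f1 = 2 * precision * recall / (precision + recall)
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
```

Every one of these is a different ratio over the same four counts, which is why the confusion matrix is the common starting point.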
The technique is mostly to run an algorithm on some data to obtain a model, then apply that model to new, previously unseen data, evaluate the metric on that data set, and repeat.
Some techniques (actually resampling techniques from statistics) include cross-validation and bootstrapping.
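A minimal sketch of one such resampling technique, k-fold cross-validation: shuffle the data, split it into k folds, train on k−1 folds, evaluate on the held-out fold, and average the scores. The "model" here is a trivial majority-class classifier, chosen only to keep the example self-contained:

```python
import random

def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def majority_class(labels):
    """Trivial 'model': always predict the most frequent label."""
    return max(set(labels), key=labels.count)

def cross_validate(y, k=4):
    """Average held-out accuracy of the majority-class model."""
    folds = kfold_indices(len(y), k)
    accs = []
    for test_idx in folds:
        test_set = set(test_idx)
        train_y = [y[j] for j in range(len(y)) if j not in test_set]
        pred = majority_class(train_y)          # "fit" on training folds
        acc = sum(y[j] == pred for j in test_idx) / len(test_idx)
        accs.append(acc)
    return sum(accs) / len(accs)

y = [1, 1, 1, 1, 1, 0, 0, 0]
print(cross_validate(y, k=4))
```

With a real learner you would fit on the training folds instead of taking the majority class, but the split/fit/score/average loop is the same.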