Is it possible to use the log_loss metric in GridSearchCV?
I have seen a few posts where people mention neg_log_loss. Is it the same as log_loss? If not, is it possible to use log_loss directly in GridSearchCV?
As stated in the documentation, scoring accepts several kinds of input: a string, a callable, a list/tuple, a dict, or None. If you use strings, the valid names are listed in the scikit-learn documentation on the scoring parameter.
There, the string name for log loss is "neg_log_loss", i.e. the negative log loss, which is simply the log loss multiplied by -1. This is an easy way to turn a minimization problem into a maximization one: GridSearchCV expects a score, where higher is better, not a loss, and the minimum log loss is equivalent to the maximum negative log loss.
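To make the sign flip concrete, here is a minimal pure-Python sketch. The helper log_loss_binary and the two sets of predicted probabilities are made up for illustration and are not part of scikit-learn; the point is only that picking the smallest loss and picking the largest negated loss select the same candidate.

```python
import math

def log_loss_binary(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; y_prob holds P(class == 1) per sample."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

y_true = [1, 0, 1, 1]
# Hypothetical predicted probabilities from two candidate models.
candidates = {
    "model_a": [0.9, 0.2, 0.8, 0.7],
    "model_b": [0.6, 0.5, 0.5, 0.4],
}

losses = {name: log_loss_binary(y_true, p) for name, p in candidates.items()}
scores = {name: -loss for name, loss in losses.items()}  # "neg_log_loss"-style score

best_by_loss = min(losses, key=losses.get)    # minimize the loss
best_by_score = max(scores, key=scores.get)   # maximize the negated loss
print(best_by_loss, best_by_score)
```

Both selections agree, which is why a search that always maximizes its score can still be used to minimize log loss.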
If you instead want to pass a log loss function directly to GridSearchCV, create a scorer from scikit-learn's log_loss function with make_scorer:
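from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import log_loss, make_scorer

iris = datasets.load_iris()
parameters = {'kernel': ('linear', 'rbf'), 'C': [1, 10]}
# probability=True is needed so that SVC exposes predict_proba
svc = svm.SVC(gamma="scale", probability=True)

# greater_is_better=False makes the scorer return -log_loss, so higher is still better.
# Note: scikit-learn 1.4+ deprecates needs_proba in favor of response_method="predict_proba".
LogLoss = make_scorer(log_loss, greater_is_better=False, needs_proba=True)

clf = GridSearchCV(svc, parameters, cv=5, scoring=LogLoss)
clf.fit(iris.data, iris.target)
# best_score_ is the negated log loss, so it will be <= 0
print(clf.best_score_, clf.best_estimator_)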
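For comparison, a sketch of the same search using the built-in "neg_log_loss" string instead of a custom scorer. The random_state=0 argument and the best_log_loss variable are additions for illustration only; since the score is the negated loss, flipping the sign of best_score_ should recover the ordinary (positive) log loss.

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
parameters = {"kernel": ("linear", "rbf"), "C": [1, 10]}
# random_state is fixed here only to make the probability calibration repeatable
svc = svm.SVC(gamma="scale", probability=True, random_state=0)

# The string selects the built-in scorer that returns -log_loss
clf = GridSearchCV(svc, parameters, cv=5, scoring="neg_log_loss")
clf.fit(iris.data, iris.target)

# Undo the sign flip to report the mean cross-validated log loss
best_log_loss = -clf.best_score_
print(clf.best_params_, best_log_loss)
```

Either route ranks the parameter combinations by the same quantity, so the string form is usually the shorter option unless you need to pass extra arguments to log_loss.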