How to do GridSearchCV for F1-score in classification problem with scikit-learn?

I'm working on a multi-class classification problem with a neural network in scikit-learn, and I'm trying to figure out how to optimize my hyperparameters (number of layers, number of neurons per layer, and eventually other parameters).

I found out that GridSearchCV is the way to do it, but the code I'm using returns the mean accuracy, while I actually want to optimize for the F1-score. Does anyone have an idea how I can edit this code to make it work for the F1-score?

At first, when I had to evaluate precision/accuracy, I thought it was 'enough' to just look at the confusion matrix and draw conclusions from it, while changing the number of layers and neurons in my neural network by trial and error, again and again.

Today I found out that there's more to it: GridSearchCV. I just need to figure out how to evaluate the F1-score, because I need to study how the network's performance depends on the number of layers, nodes, and eventually other parameters...

from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import GridSearchCV

# parameter_space must be defined before it is passed to GridSearchCV
parameter_space = {
    'hidden_layer_sizes': [(1), (2), (3)],  # note: (1) is just the int 1, i.e. one hidden layer with 1 neuron
}

mlp = MLPClassifier(max_iter=600)
clf = GridSearchCV(mlp, parameter_space, n_jobs=-1, cv=3)
clf.fit(X_train, y_train.values.ravel())

print('Best parameters found:\n', clf.best_params_)

means = clf.cv_results_['mean_test_score']
stds = clf.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, clf.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r" % (mean, std * 2, params))

output:

Best parameters found:
 {'hidden_layer_sizes': 3}
0.842 (+/-0.089) for {'hidden_layer_sizes': 1}
0.882 (+/-0.031) for {'hidden_layer_sizes': 2}
0.922 (+/-0.059) for {'hidden_layer_sizes': 3}

So my output gives me the mean accuracy (which I found is the default scoring for GridSearchCV). How can I change this to return the average F1-score instead of accuracy?

asked May 10 '19 by Jonas


1 Answer

You can create your own scorer with make_scorer. In this case it wraps sklearn's f1_score, but you can wrap your own metric function if you prefer:

from sklearn.metrics import f1_score, make_scorer

f1 = make_scorer(f1_score, average='macro')


Once you have created your scorer, you can pass it directly to GridSearchCV via the scoring parameter:

clf = GridSearchCV(mlp, parameter_space, n_jobs=-1, cv=3, scoring=f1)


Note that I've used average='macro' as the multi-class averaging parameter for f1_score. This calculates the metric for each label and then takes their unweighted mean. There are other options for computing F1 with multiple labels (e.g. 'micro' or 'weighted'); you can find them in the scikit-learn documentation for f1_score.
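
For completeness, here is a minimal end-to-end sketch of the idea on a small synthetic dataset (the data and the n_samples/n_classes values are placeholders, not from the question). It also shows that the built-in string alias 'f1_macro' can be passed as scoring instead of the make_scorer object:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import f1_score, make_scorer

# synthetic multi-class data, just to make the sketch runnable
X, y = make_classification(n_samples=500, n_classes=3, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

parameter_space = {'hidden_layer_sizes': [(1), (2), (3)]}

f1 = make_scorer(f1_score, average='macro')   # custom scorer; scoring='f1_macro' is the equivalent built-in alias

clf = GridSearchCV(MLPClassifier(max_iter=600), parameter_space,
                   n_jobs=-1, cv=3, scoring=f1)
clf.fit(X_train, y_train)

# mean_test_score now holds the mean macro F1 for each parameter combination
print(clf.best_params_)
print(clf.cv_results_['mean_test_score'])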



answered Oct 13 '22 by Haritz Laboa