
Balanced_accuracy is not a valid scoring value in scikit-learn

This is very similar to this post: ValueError: 'balanced_accuracy' is not a valid scoring value in scikit-learn

I am using:

from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

scoring = ['precision_macro', 'recall_macro', 'balanced_accuracy_score']
clf = DecisionTreeClassifier(random_state=0)
scores = cross_validate(clf, X, y, scoring=scoring, cv=10, return_train_score=True)

And I receive the error:

ValueError: 'balanced_accuracy_score' is not a valid scoring value. Use sorted(sklearn.metrics.SCORERS.keys()) to get valid options.

I applied the recommended solution and upgraded scikit-learn (in the environment).

When I check the possible scorers:

sklearn.metrics.SCORERS.keys()
dict_keys(['explained_variance', 'r2', 'max_error', 'neg_median_absolute_error', 'neg_mean_absolute_error', 'neg_mean_squared_error', 'neg_mean_squared_log_error', 'neg_root_mean_squared_error', 'neg_mean_poisson_deviance', 'neg_mean_gamma_deviance', 'accuracy', 'roc_auc', 'roc_auc_ovr', 'roc_auc_ovo', 'roc_auc_ovr_weighted', 'roc_auc_ovo_weighted', 'balanced_accuracy', 'average_precision', 'neg_log_loss', 'neg_brier_score', 'adjusted_rand_score', 'homogeneity_score', 'completeness_score', 'v_measure_score', 'mutual_info_score', 'adjusted_mutual_info_score', 'normalized_mutual_info_score', 'fowlkes_mallows_score', 'precision', 'precision_macro', 'precision_micro', 'precision_samples', 'precision_weighted', 'recall', 'recall_macro', 'recall_micro', 'recall_samples', 'recall_weighted', 'f1', 'f1_macro', 'f1_micro', 'f1_samples', 'f1_weighted', 'jaccard', 'jaccard_macro', 'jaccard_micro', 'jaccard_samples', 'jaccard_weighted'])

I still cannot find it. Where is the problem?

asked Dec 17 '19 at 15:12 by PV8



1 Answer

According to the docs for valid scorers, the value of the scoring parameter corresponding to the balanced_accuracy_score metric function is "balanced_accuracy", as in my other answer:

Change:

scoring = ['precision_macro', 'recall_macro', 'balanced_accuracy_score']

to:

scoring = ['precision_macro', 'recall_macro', 'balanced_accuracy']

and it should work.
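As a quick check, here is a minimal self-contained sketch of the corrected call. The iris dataset is only an assumption standing in for the X and y from the question, which are not shown there; everything else mirrors the question's code.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

# Assumption: iris is a stand-in for the question's X and y.
X, y = load_iris(return_X_y=True)

# 'balanced_accuracy' (without the _score suffix) is the valid scoring string.
scoring = ['precision_macro', 'recall_macro', 'balanced_accuracy']
clf = DecisionTreeClassifier(random_state=0)
scores = cross_validate(clf, X, y, scoring=scoring, cv=10, return_train_score=True)

# Results are keyed as test_<name> and train_<name>, one value per fold.
print(sorted(scores.keys()))
print(scores['test_balanced_accuracy'].mean())
```

With multiple metrics, cross_validate returns a dict, so the balanced accuracy per fold is available under 'test_balanced_accuracy' and 'train_balanced_accuracy'.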

I do find the documentation a bit lacking in this respect, and this convention of removing the _score suffix is not consistent either, as all the clustering metrics still have _score in their names in their scoring parameter values.
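One way to sidestep the naming inconsistency, sketched here as an alternative rather than as the recommended approach, is to pass the metric functions themselves wrapped with make_scorer. The dict keys below are arbitrary labels chosen for illustration, and the iris dataset is an assumed stand-in for the question's data.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import (balanced_accuracy_score, make_scorer,
                             precision_score, recall_score)
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

# Assumption: iris is a stand-in for the question's X and y.
X, y = load_iris(return_X_y=True)

# Keys are free-form labels; values are scorers built from metric functions,
# so no scoring-string lookup is involved.
scoring = {
    'precision_macro': make_scorer(precision_score, average='macro'),
    'recall_macro': make_scorer(recall_score, average='macro'),
    'balanced_accuracy': make_scorer(balanced_accuracy_score),
}

clf = DecisionTreeClassifier(random_state=0)
scores = cross_validate(clf, X, y, scoring=scoring, cv=10, return_train_score=True)

print(scores['test_balanced_accuracy'])
```

The trade-off is a little more boilerplate in exchange for not having to remember which scoring strings keep the _score suffix.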

answered Oct 13 '22 at 19:10 by Mihai Chelaru