I'm trying to compute the AUC score for a multiclass problem using sklearn's roc_auc_score() function. I have a prediction matrix of shape [n_samples, n_classes] and a ground truth vector of shape [n_samples], named np_pred and np_label respectively.
What I'm trying to achieve is the set of AUC scores, one for each class that I have. To do so I would like to use the average parameter option None and the multi_class parameter set to "ovr", but if I run

roc_auc_score(y_score=np_pred, y_true=np_label, multi_class="ovr", average=None)
I get back
ValueError: average must be one of ('macro', 'weighted') for multiclass problems
This error is expected from the sklearn function in the multiclass case; but if you take a look at the roc_auc_score function's source code, you can see that if the multi_class parameter is set to "ovr" and the average is one of the accepted ones, the multiclass case is treated as a multilabel one, and the internal multilabel function accepts None as the average parameter.
So, by looking at the code, it seems that I should be able to compute a multiclass One-vs-Rest score with a None average, but the ifs in the source code do not allow such a combination.
Am I wrong?
In case I'm wrong, from a theoretical point of view, should I fake a multilabel case just to get the different AUCs for the different classes, or should I write my own function that cycles through the different classes and outputs the AUCs?
Thanks
The roc_auc_score always ranges from 0 to 1 and scores how well the predicted probabilities rank positive samples above negative ones. 0.5 is the baseline for random guessing, so you always want to be above 0.5.
roc_auc_score is defined as the area under the ROC curve, which is the curve of the False Positive Rate on the x-axis against the True Positive Rate on the y-axis at all classification thresholds. But it's impossible to calculate FPR and TPR for regression models, so we cannot take this road there.
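For a binary classifier this is easy to see in code. Here is a minimal sketch with made-up labels and scores: sklearn's roc_curve gives the FPR/TPR pairs, and auc integrates them.

import numpy as np
from sklearn.metrics import roc_curve, auc

# Made-up binary example: 4 samples and their predicted scores
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# FPR and TPR at every classification threshold
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(auc(fpr, tpr))  # 0.75, identical to roc_auc_score(y_true, y_score)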
How do AUC ROC plots work for multiclass models? For multiclass problems, ROC curves can be plotted with a one-versus-rest methodology: each class in turn is treated as the positive class and all the others as negative, giving you as many curves as classes. The AUC score can likewise be calculated for each class individually.
The Area Under the Curve (AUC) is the measure of the ability of a classifier to distinguish between classes and is used as a summary of the ROC curve. The higher the AUC, the better the performance of the model at distinguishing between the positive and negative classes.
As you already know, right now sklearn's multiclass ROC AUC only handles the macro and weighted averages. But One-vs-Rest can be implemented by hand, and then it can return the score for each class individually.
Theoretically speaking, you could implement OvR and calculate per-class roc_auc_score, as:
from sklearn.metrics import roc_auc_score

roc = {}
for label in multi_class_series.unique():
    # Binarize the target: the current class vs. the rest
    selected_classifier.fit(train_set_dataframe, train_class == label)
    predictions_proba = selected_classifier.predict_proba(test_set_dataframe)
    # Column 1 holds the probability of the positive (current) class
    roc[label] = roc_auc_score(test_class == label, predictions_proba[:, 1])
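Alternatively, to pick up the question's other idea: "faking" the multilabel case does work. If you one-hot encode the ground truth with label_binarize, roc_auc_score goes down the multilabel branch, which does accept average=None and returns one AUC per class. A sketch, assuming np_pred and np_label from the question, with the columns of np_pred ordered like the sorted class labels:

import numpy as np
from sklearn.preprocessing import label_binarize
from sklearn.metrics import roc_auc_score

# One-hot encode the labels so roc_auc_score treats this as multilabel;
# the column order of np_pred is assumed to match the sorted classes
classes = np.unique(np_label)
np_label_bin = label_binarize(np_label, classes=classes)

# Multilabel branch: average=None returns one AUC per class
per_class_auc = roc_auc_score(np_label_bin, np_pred, average=None)
print(dict(zip(classes, per_class_auc)))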