Computing AUC and ROC curve from multi-class data in scikit-learn (sklearn)?

I am trying to use the scikit-learn module to compute AUC and plot ROC curves for the output of three different classifiers, so that I can compare their performance. I am very new to this topic, and I am struggling to understand how the data I have should be passed to the roc_curve and auc functions.

For each item in the test set, I have the true class and the class predicted by each of the three classifiers. The classes are ['N', 'L', 'W', 'T']. In addition, I have a confidence score for each prediction made by the classifiers. How do I pass this information to the roc_curve function?

Do I need to label_binarize my input data? How do I convert a list of [class, confidence] pairs output by the classifiers into the y_score expected by roc_curve?

Thank you for any help! Good resources about ROC curves would also be helpful.

679

asked Nov 05 '15 15:11

Suriname0

1 Answer

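You need to binarize the labels with the label_binarize function. roc_curve handles only binary problems, so once the labels are binarized you can compute one ROC curve per class (one-vs-rest) and plot them together as a multi-class ROC.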
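label_binarize also accepts string labels like the ones in the question. Here is a minimal sketch, assuming made-up true labels (y_true is illustrative only, not the asker's data):

```python
from sklearn.preprocessing import label_binarize

classes = ['N', 'L', 'W', 'T']

# Made-up true labels, for illustration only
y_true = ['N', 'W', 'T', 'L', 'N']

# One row per sample, one column per class;
# the columns follow the order given in `classes`: N, L, W, T
y_true_bin = label_binarize(y_true, classes=classes)
print(y_true_bin)
```

Each column of y_true_bin is then a binary "this class vs. the rest" target that can be passed to roc_curve.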

Example using Iris data:

import matplotlib.pyplot as plt
from sklearn import svm, datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize
from sklearn.metrics import roc_curve, auc
from sklearn.multiclass import OneVsRestClassifier
from itertools import cycle
plt.style.use('ggplot')

iris = datasets.load_iris()
X = iris.data
y = iris.target

# Binarize the output
y = label_binarize(y, classes=[0, 1, 2])
n_classes = y.shape[1]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5, random_state=0)

# One-vs-rest linear SVM; decision_function returns one score column per class
classifier = OneVsRestClassifier(svm.SVC(kernel='linear', probability=True,
                                 random_state=0))
y_score = classifier.fit(X_train, y_train).decision_function(X_test)

# Compute the ROC curve and AUC for each class (one-vs-rest)
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(n_classes):
    fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])

# Plot one ROC curve per class
colors = cycle(['blue', 'red', 'green'])
for i, color in zip(range(n_classes), colors):
    plt.plot(fpr[i], tpr[i], color=color, lw=1.5,
             label='ROC curve of class {0} (area = {1:0.2f})'.format(i, roc_auc[i]))

# Diagonal reference line: the performance of a random classifier
plt.plot([0, 1], [0, 1], 'k--', lw=1.5)
plt.xlim([-0.05, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic for multi-class data')
plt.legend(loc="lower right")
plt.show()
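If you want a single number per classifier for the comparison, rather than one AUC per class, roc_auc_score can average over the binarized columns. This is a sketch on the same Iris setup, not something the original code computes; the names macro_auc and micro_auc are my own:

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import label_binarize

iris = datasets.load_iris()
y = label_binarize(iris.target, classes=[0, 1, 2])
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, y, test_size=.5, random_state=0)

classifier = OneVsRestClassifier(svm.SVC(kernel='linear', random_state=0))
y_score = classifier.fit(X_train, y_train).decision_function(X_test)

# macro: unweighted mean of the per-class AUCs
macro_auc = roc_auc_score(y_test, y_score, average='macro')
# micro: pools every (sample, class) decision into one binary problem
micro_auc = roc_auc_score(y_test, y_score, average='micro')

print('macro-average AUC: {0:0.2f}'.format(macro_auc))
print('micro-average AUC: {0:0.2f}'.format(micro_auc))
```

The macro average treats all classes equally, while the micro average is dominated by the more frequent classes, so which one suits the comparison depends on how balanced the data is.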

[Image: plot titled "Receiver operating characteristic for multi-class data", with one ROC curve per class]
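As for turning a list of [class, confidence] pairs into y_score: roc_curve needs one score per sample for each class, but a pair only gives the score of the predicted class. If your classifiers can output a full score per class, use that instead. Otherwise, one possible workaround (an assumption on my part, not an sklearn convention) is to give the predicted class its confidence and spread the remainder evenly over the other classes. The helper pairs_to_scores and the data below are hypothetical:

```python
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize

classes = ['N', 'L', 'W', 'T']


def pairs_to_scores(pairs, classes):
    """Hypothetical helper: build an (n_samples, n_classes) score matrix.

    Assumption: the predicted class gets its confidence and the
    remaining mass is split evenly among the other classes.
    """
    index = {c: i for i, c in enumerate(classes)}
    n = len(classes)
    scores = np.empty((len(pairs), n))
    for row, (label, conf) in enumerate(pairs):
        scores[row, :] = (1.0 - conf) / (n - 1)
        scores[row, index[label]] = conf
    return scores


# Made-up true labels and (predicted class, confidence) pairs
y_true = ['N', 'L', 'W', 'T', 'N', 'L', 'W', 'T']
pairs = [('N', 0.9), ('L', 0.6), ('T', 0.5), ('T', 0.8),
         ('L', 0.4), ('L', 0.7), ('W', 0.9), ('N', 0.55)]

y_true_bin = label_binarize(y_true, classes=classes)
y_score = pairs_to_scores(pairs, classes)

# One-vs-rest ROC curve and AUC for each class
roc_auc = {}
for i, name in enumerate(classes):
    fpr, tpr, _ = roc_curve(y_true_bin[:, i], y_score[:, i])
    roc_auc[name] = auc(fpr, tpr)
print(roc_auc)
```

Because the non-predicted classes all receive the same score within a row, the resulting curves are coarser than they would be with true per-class scores, so treat the AUC values from this approach as approximate.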

197

answered Oct 21 '22 09:10

seralouk


Tags: python, machine-learning, scikit-learn, roc, auc
