I would like to calculate AUC, precision, and accuracy for my classifier. I am doing supervised learning.
Here is my working code. It works fine for the binary-class case, but not for multiclass. Please assume that you have a dataframe with binary classes:
sample_features_dataframe = self._get_sample_features_dataframe()
labeled_sample_features_dataframe = retrieve_labeled_sample_dataframe(sample_features_dataframe)
labeled_sample_features_dataframe, binary_class_series, multi_class_series = self._prepare_dataframe_for_learning(labeled_sample_features_dataframe)
k = 10
k_folds = StratifiedKFold(binary_class_series, k)

# accumulators for averaging the metrics over the k folds
roc = accuracy = recall = precision = 0.0

for train_indexes, test_indexes in k_folds:
    train_set_dataframe = labeled_sample_features_dataframe.loc[train_indexes.tolist()]
    test_set_dataframe = labeled_sample_features_dataframe.loc[test_indexes.tolist()]
    train_class = binary_class_series[train_indexes]
    test_class = binary_class_series[test_indexes]

    selected_classifier = RandomForestClassifier(n_estimators=100)
    selected_classifier.fit(train_set_dataframe, train_class)

    predictions = selected_classifier.predict(test_set_dataframe)
    predictions_proba = selected_classifier.predict_proba(test_set_dataframe)

    roc += roc_auc_score(test_class, predictions_proba[:, 1])
    accuracy += accuracy_score(test_class, predictions)
    recall += recall_score(test_class, predictions)
    precision += precision_score(test_class, predictions)
In the end I divided the results by K, of course, to get the average AUC, precision, etc. This code works fine. However, I cannot calculate the same metrics for the multiclass case:
train_class = multi_class_series[train_indexes]
test_class = multi_class_series[test_indexes]
selected_classifier = RandomForestClassifier(n_estimators=100)
selected_classifier.fit(train_set_dataframe, train_class)
predictions = selected_classifier.predict(test_set_dataframe)
predictions_proba = selected_classifier.predict_proba(test_set_dataframe)
I found that for multiclass I have to add the parameter average="weighted":
roc += roc_auc_score(test_class, predictions_proba[:,1], average="weighted")
I got an error:

raise ValueError("{0} format is not supported".format(y_type))
ValueError: multiclass format is not supported
How do ROC AUC plots work for multiclass models? For multiclass problems, ROC curves can be plotted with a one-versus-rest methodology: build one curve per class (that class against all the others), so you end up with as many curves as classes, and an AUC score can be calculated for each class individually.
In recent versions of scikit-learn, the roc_auc_score function can also be used directly for multiclass classification, using either the one-vs-rest (OvR) scheme, which scores each class against the rest, or the one-vs-one (OvO) scheme, which compares every unique pairwise combination of classes.
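A minimal sketch, assuming scikit-learn >= 0.22 (the release in which roc_auc_score gained the multi_class parameter; the "multiclass format is not supported" error from the question is what you get without it) and reusing the test_class and predictions_proba variables from the question's loop:

from sklearn.metrics import roc_auc_score

# predictions_proba has shape (n_samples, n_classes), straight from predict_proba
roc_ovr = roc_auc_score(test_class, predictions_proba, multi_class="ovr", average="weighted")
roc_ovo = roc_auc_score(test_class, predictions_proba, multi_class="ovo", average="macro")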
You can't use roc_auc as a single summary metric for multiclass models. If you want, you could calculate a per-class roc_auc, as:
roc = {label: [] for label in multi_class_series.unique()}
for label in multi_class_series.unique():
    selected_classifier.fit(train_set_dataframe, train_class == label)
    predictions_proba = selected_classifier.predict_proba(test_set_dataframe)
    roc[label].append(roc_auc_score(test_class == label, predictions_proba[:, 1]))
However, it's more usual to use sklearn.metrics.confusion_matrix to evaluate the performance of a multiclass model.
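For example, a minimal sketch reusing the test_class and predictions variables from the question's multiclass loop:

from sklearn.metrics import confusion_matrix

# rows are true labels, columns are predicted labels, in sorted label order
print(confusion_matrix(test_class, predictions))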
The average option of roc_auc_score is only defined for multilabel problems.
You can take a look at the following example from the scikit-learn documentation to define your own micro- or macro-averaged scores for multiclass problems:
http://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html#multiclass-settings
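Sketching that idea under the assumption that selected_classifier, test_class and predictions_proba come from the question's multiclass loop: binarize the true labels yourself and score the resulting label-indicator matrix.

from sklearn.preprocessing import label_binarize
from sklearn.metrics import roc_auc_score

# binarize the true labels in the same column order as predict_proba uses
y_test_bin = label_binarize(test_class, classes=selected_classifier.classes_)

# micro-average: pool all (sample, class) decisions before computing the AUC
micro_auc = roc_auc_score(y_test_bin, predictions_proba, average="micro")
# macro-average: unweighted mean of the per-class AUCs
macro_auc = roc_auc_score(y_test_bin, predictions_proba, average="macro")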
Edit: there is an issue on the scikit-learn tracker to implement ROC AUC for multiclass problems: https://github.com/scikit-learn/scikit-learn/issues/3298
As mentioned here, to the best of my knowledge there is not yet a way to easily compute ROC AUC for multiclass settings natively in sklearn.
However, if you are familiar with classification_report, you may like this simple implementation that returns the same output as classification_report as a pandas.DataFrame, which I personally found very handy:
import pandas as pd
import numpy as np
from sklearn.metrics import precision_recall_fscore_support
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import LabelBinarizer


def class_report(y_true, y_pred, y_score=None, average='micro'):
    if y_true.shape != y_pred.shape:
        print("Error! y_true %s is not the same shape as y_pred %s" % (
            y_true.shape,
            y_pred.shape)
        )
        return

    lb = LabelBinarizer()

    if len(y_true.shape) == 1:
        lb.fit(y_true)

    # Value counts of predictions
    labels, cnt = np.unique(
        y_pred,
        return_counts=True)
    n_classes = len(labels)
    pred_cnt = pd.Series(cnt, index=labels)

    metrics_summary = precision_recall_fscore_support(
        y_true=y_true,
        y_pred=y_pred,
        labels=labels)

    avg = list(precision_recall_fscore_support(
        y_true=y_true,
        y_pred=y_pred,
        average='weighted'))

    metrics_sum_index = ['precision', 'recall', 'f1-score', 'support']
    class_report_df = pd.DataFrame(
        list(metrics_summary),
        index=metrics_sum_index,
        columns=labels)

    support = class_report_df.loc['support']
    total = support.sum()
    class_report_df['avg / total'] = avg[:-1] + [total]

    class_report_df = class_report_df.T
    class_report_df['pred'] = pred_cnt
    class_report_df.loc['avg / total', 'pred'] = total

    if not (y_score is None):
        fpr = dict()
        tpr = dict()
        roc_auc = dict()
        for label_it, label in enumerate(labels):
            fpr[label], tpr[label], _ = roc_curve(
                (y_true == label).astype(int),
                y_score[:, label_it])

            roc_auc[label] = auc(fpr[label], tpr[label])

        if average == 'micro':
            if n_classes <= 2:
                fpr["avg / total"], tpr["avg / total"], _ = roc_curve(
                    lb.transform(y_true).ravel(),
                    y_score[:, 1].ravel())
            else:
                fpr["avg / total"], tpr["avg / total"], _ = roc_curve(
                    lb.transform(y_true).ravel(),
                    y_score.ravel())

            roc_auc["avg / total"] = auc(
                fpr["avg / total"],
                tpr["avg / total"])

        elif average == 'macro':
            # First aggregate all false positive rates
            all_fpr = np.unique(np.concatenate([
                fpr[i] for i in labels]
            ))

            # Then interpolate all ROC curves at these points
            mean_tpr = np.zeros_like(all_fpr)
            for i in labels:
                mean_tpr += np.interp(all_fpr, fpr[i], tpr[i])

            # Finally average it and compute AUC
            mean_tpr /= n_classes

            fpr["macro"] = all_fpr
            tpr["macro"] = mean_tpr

            roc_auc["avg / total"] = auc(fpr["macro"], tpr["macro"])

        class_report_df['AUC'] = pd.Series(roc_auc)

    return class_report_df
Here is an example:
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=5000, n_features=10,
                           n_informative=5, n_redundant=0,
                           n_classes=10, random_state=0,
                           shuffle=False)

X_train, X_test, y_train, y_test = train_test_split(X, y)
model = RandomForestClassifier(max_depth=2, random_state=0)
model.fit(X_train, y_train)
Regular classification_report:
sk_report = classification_report(
    digits=6,
    y_true=y_test,
    y_pred=model.predict(X_test))
print(sk_report)
Out:
             precision    recall  f1-score   support

          0   0.262774  0.553846  0.356436       130
          1   0.405405  0.333333  0.365854       135
          2   0.367347  0.150000  0.213018       120
          3   0.350993  0.424000  0.384058       125
          4   0.379310  0.447154  0.410448       123
          5   0.525000  0.182609  0.270968       115
          6   0.362573  0.488189  0.416107       127
          7   0.330189  0.299145  0.313901       117
          8   0.328571  0.407080  0.363636       113
          9   0.571429  0.248276  0.346154       145
avg / total   0.390833  0.354400  0.345438      1250
Custom classification_report:
report_with_auc = class_report(
    y_true=y_test,
    y_pred=model.predict(X_test),
    y_score=model.predict_proba(X_test))
print(report_with_auc)
Out:
             precision    recall  f1-score  support    pred       AUC
0             0.262774  0.553846  0.356436    130.0   274.0  0.766477
1             0.405405  0.333333  0.365854    135.0   111.0  0.773974
2             0.367347  0.150000  0.213018    120.0    49.0  0.817341
3             0.350993  0.424000  0.384058    125.0   151.0  0.803364
4             0.379310  0.447154  0.410448    123.0   145.0  0.802436
5             0.525000  0.182609  0.270968    115.0    40.0  0.680870
6             0.362573  0.488189  0.416107    127.0   171.0  0.855768
7             0.330189  0.299145  0.313901    117.0   106.0  0.766526
8             0.328571  0.407080  0.363636    113.0   140.0  0.754812
9             0.571429  0.248276  0.346154    145.0    63.0  0.769100
avg / total   0.390833  0.354400  0.345438   1250.0  1250.0  0.776071
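As a rough sketch of how this could slot into the question's StratifiedKFold loop (reusing the question's variables, which is an assumption, not part of the answer above): collect one report per fold and average the numeric columns at the end.

import pandas as pd

fold_reports = []
for train_indexes, test_indexes in k_folds:
    train_set_dataframe = labeled_sample_features_dataframe.loc[train_indexes.tolist()]
    test_set_dataframe = labeled_sample_features_dataframe.loc[test_indexes.tolist()]
    train_class = multi_class_series[train_indexes]
    test_class = multi_class_series[test_indexes]

    selected_classifier = RandomForestClassifier(n_estimators=100)
    selected_classifier.fit(train_set_dataframe, train_class)

    fold_reports.append(class_report(
        y_true=test_class,
        y_pred=selected_classifier.predict(test_set_dataframe),
        y_score=selected_classifier.predict_proba(test_set_dataframe)))

# average precision, recall, f1-score, support, pred and AUC per label across the folds
# (sort=False avoids comparing the 'avg / total' row label with non-string class labels)
print(pd.concat(fold_reports).groupby(level=0, sort=False).mean())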