I am asking a follow-up question, as suggested in my previous post, Good ROC curve but poor precision-recall curve. I am only using the default settings with Python scikit-learn. It seems like the optimization is for AUC-ROC, but I am more interested in optimizing precision-recall. The following is my code.
# Imports (classifierUsed2, X_test, y_test, ethnicity_tar, color,
# ax1 and ax2 are defined earlier in my script)
from sklearn.metrics import roc_curve, auc, precision_recall_curve

# Get ROC
y_score = classifierUsed2.decision_function(X_test)
false_positive_rate, true_positive_rate, thresholds = roc_curve(y_test, y_score)
roc_auc = auc(false_positive_rate, true_positive_rate)
print('AUC-' + ethnicity_tar + '=', roc_auc)
# Plotting
ax1.plot(false_positive_rate, true_positive_rate, c=color, label=('AUC-'+ethnicity_tar+'= %0.2f' % roc_auc))
ax1.plot([0, 1], [0, 1], color='lightgrey', linestyle='--')
ax1.legend(loc='lower right', prop={'size': 8})
# Get P-R pairs
precision, recall, prThreshold = precision_recall_curve(y_test, y_score)
# Plotting
ax2.plot(recall, precision, c=color, label=ethnicity_tar)
ax2.legend(loc='upper right', prop={'size': 8})
Where and how do I insert Python code to change the settings so that I can optimize for precision-recall?
There are in fact two questions in yours: how to measure the quality of a precision-recall curve, and how to optimize for it. I will answer them in turn:
1. The measure of the quality of a precision-recall curve is average precision, which equals the exact area under the non-interpolated (that is, piecewise-constant) precision-recall curve (a sketch of computing it in scikit-learn follows this list).
2. To maximize average precision, you can only tune the hyperparameters of your algorithm. You can do this with GridSearchCV if you set scoring='average_precision' (sketched further below), or you can find optimal hyperparameters manually or with some other tuning technique.
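To compute average precision from point 1, scikit-learn provides average_precision_score. A minimal sketch reusing y_test, y_score, and ethnicity_tar from your snippet:

from sklearn.metrics import average_precision_score

# Average precision = exact area under the non-interpolated P-R curve;
# y_test and y_score are the same arrays used in the ROC code above.
average_precision = average_precision_score(y_test, y_score)
print('AP-' + ethnicity_tar + '=', average_precision)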
Returning to point 2: it is generally impossible to optimize average precision directly (during model fitting), although there are some exceptions; e.g., this article describes an SVM that maximizes average precision.
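In the common case, then, hyperparameter search is the practical route. A minimal GridSearchCV sketch follows; the LinearSVC estimator, the parameter grid, and the X_train/y_train names are illustrative assumptions, so substitute your own classifier and training data:

from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# Hypothetical grid; replace with the hyperparameters of your own classifier.
param_grid = {'C': [0.01, 0.1, 1, 10, 100]}

# scoring='average_precision' makes the search select the hyperparameters
# that maximize average precision across the cross-validation folds.
grid = GridSearchCV(LinearSVC(), param_grid, scoring='average_precision', cv=5)
grid.fit(X_train, y_train)

print('Best parameters:', grid.best_params_)
print('Best CV average precision:', grid.best_score_)

The fitted grid.best_estimator_ can then stand in for classifierUsed2 in your plotting code.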