
Scikit Learn - How to plot probabilities

I want to plot the model's predicted probabilities.

plt.scatter(y_test, prediction[:,0])
plt.xlabel("True Values")
plt.ylabel("Predictions")
plt.show()

Graph

However, I get a graph like the one above. It kind of makes sense, but I want to visualize the probability distribution better. Is there a way to do this when my actual classes are 0 or 1 and the predictions are between 0 and 1?

Kay asked Jan 30 '23 18:01


1 Answer

Predicted probabilities can be used to visualize model performance: plot the distribution of the predicted probability of the positive class, and use color to indicate the true label of each record.

Try this example:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
import matplotlib.pyplot as plt

# Synthetic binary classification data
X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=1, shuffle=False)

lr = LogisticRegression(random_state=0, solver='lbfgs', max_iter=10)
lr.fit(X, y)

# Column 1 of predict_proba is the probability of the positive class (label 1)
prediction = lr.predict_proba(X)[:, 1]

# One histogram per true class; color indicates the true label
plt.figure(figsize=(15, 7))
plt.hist(prediction[y == 0], bins=50, label='Negatives')
plt.hist(prediction[y == 1], bins=50, label='Positives', alpha=0.7, color='r')
plt.xlabel('Probability of being Positive Class', fontsize=25)
plt.ylabel('Number of records in each bucket', fontsize=25)
plt.legend(fontsize=15)
plt.tick_params(axis='both', labelsize=25, pad=5)
plt.show()
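The example above scores the same data the model was fitted on, while the question plots against y_test. The following is a minimal sketch of the same per-class histogram idea on a held-out split. The synthetic data, the 25% split, and the helper name plot_probability_histograms are assumptions for illustration and are not taken from the question. Also note that if prediction in the question comes from predict_proba, then prediction[:,0] is the probability of the first class in classes_ (class 0 here), not of class 1.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def plot_probability_histograms(y_true, proba_positive, bins=20):
    """Overlay histograms of the positive-class probability, one per true class.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    y_true = np.asarray(y_true)
    proba_positive = np.asarray(proba_positive)
    # Shared bin edges on [0, 1] so the two histograms are directly comparable
    edges = np.linspace(0.0, 1.0, bins + 1)
    fig, ax = plt.subplots(figsize=(10, 5))
    ax.hist(proba_positive[y_true == 0], bins=edges, alpha=0.7, label="True class 0")
    ax.hist(proba_positive[y_true == 1], bins=edges, alpha=0.7, label="True class 1")
    ax.set_xlabel("Predicted probability of class 1")
    ax.set_ylabel("Number of test records")
    ax.legend()
    return fig, ax


# Synthetic data stands in for the asker's dataset (assumption)
X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# predict_proba columns follow model.classes_, so look up class 1 explicitly
positive_col = list(model.classes_).index(1)
proba_test = model.predict_proba(X_test)[:, positive_col]

fig, ax = plot_probability_histograms(y_test, proba_test)
```

Call plt.show() afterwards to display the figure. A model that separates the classes well should put most of the class 0 mass near 0 and most of the class 1 mass near 1; heavy overlap in the middle suggests the model is uncertain about those records.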


Venkatachalam answered Feb 02 '23 10:02