 

plotting decision boundary of logistic regression

I'm implementing logistic regression. I managed to get probabilities out of it, and I can make predictions for a two-class classification task.

My question is:

For my final model, I have the weights and the training data. There are 2 features, so my weight vector has 2 rows.

How do I plot this? I saw this post, but I don't quite understand the answer. Do I need a contour plot?

asked Jan 31 '15 by user2773013



1 Answer

An advantage of the logistic regression classifier is that once you fit it, you can get probabilities for any sample vector. That may be more interesting to plot. Here's an example using scikit-learn:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
import matplotlib.pyplot as plt
import seaborn as sns

sns.set(style="white")

First, generate the data and fit the classifier to the training set:

X, y = make_classification(200, 2, 2, 0, weights=[.5, .5], random_state=15)
clf = LogisticRegression().fit(X[:100], y[:100])

Next, make a continuous grid of values and evaluate the probability of each (x, y) point in the grid:

xx, yy = np.mgrid[-5:5:.01, -5:5:.01]
grid = np.c_[xx.ravel(), yy.ravel()]
probs = clf.predict_proba(grid)[:, 1].reshape(xx.shape)

Now, plot the probability grid as a contour map and additionally show the test set samples on top of it:

f, ax = plt.subplots(figsize=(8, 6))
contour = ax.contourf(xx, yy, probs, 25, cmap="RdBu",
                      vmin=0, vmax=1)
ax_c = f.colorbar(contour)
ax_c.set_label("$P(y = 1)$")
ax_c.set_ticks([0, .25, .5, .75, 1])

ax.scatter(X[100:, 0], X[100:, 1], c=y[100:], s=50,
           cmap="RdBu", vmin=-.2, vmax=1.2,
           edgecolor="white", linewidth=1)

ax.set(aspect="equal",
       xlim=(-5, 5), ylim=(-5, 5),
       xlabel="$X_1$", ylabel="$X_2$")

[Image: filled contour map of P(y = 1) over the feature space, with the test-set points plotted on top]

Logistic regression lets you classify new samples based on any threshold you want, so it doesn't inherently have one "decision boundary." But, of course, a common decision rule is p = .5. We can also draw just that contour level using the code above:

f, ax = plt.subplots(figsize=(8, 6))
ax.contour(xx, yy, probs, levels=[.5], cmap="Greys", vmin=0, vmax=.6)

ax.scatter(X[100:, 0], X[100:, 1], c=y[100:], s=50,
           cmap="RdBu", vmin=-.2, vmax=1.2,
           edgecolor="white", linewidth=1)

ax.set(aspect="equal",
       xlim=(-5, 5), ylim=(-5, 5),
       xlabel="$X_1$", ylabel="$X_2$")

[Image: the p = .5 decision boundary drawn as a single contour line, with the test-set points plotted on top]
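Since the p = .5 contour of a two-feature logistic regression is a straight line (the log-odds w1*x1 + w2*x2 + b are zero there), you can also draw it analytically from the fitted `coef_` and `intercept_` instead of contouring a grid. A minimal sketch, refitting the same data as above:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

# Same data and fit as in the answer above.
X, y = make_classification(200, 2, 2, 0, weights=[.5, .5], random_state=15)
clf = LogisticRegression().fit(X[:100], y[:100])

# At p = .5 the log-odds are zero: w1*x1 + w2*x2 + b = 0,
# so the boundary line is x2 = -(w1*x1 + b) / w2.
w1, w2 = clf.coef_[0]
b = clf.intercept_[0]
xs = np.linspace(-5, 5, 100)
boundary = -(w1 * xs + b) / w2

plt.plot(xs, boundary, "k--")
plt.scatter(X[100:, 0], X[100:, 1], c=y[100:], s=50, cmap="RdBu",
            vmin=-.2, vmax=1.2, edgecolor="white", linewidth=1)
plt.xlim(-5, 5)
plt.ylim(-5, 5)
```

This produces the same line as the `levels=[.5]` contour; for any other threshold t, the corresponding line just shifts the intercept by log(t / (1 - t)).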

answered Sep 19 '22 by mwaskom