I'm implementing logistic regression. I managed to get probabilities out of it, and I can make predictions for a 2-class classification task.
My question is:
For my final model, I have the weights and the training data. There are 2 features, so my weight vector has 2 rows.
How do I plot this? I saw this post, but I don't quite understand the answer. Do I need a contour plot?
For the gradient, m, consider two distinct points on the decision boundary, $(x_{a1}, x_{a2})$ and $(x_{b1}, x_{b2})$, so that $m = \frac{x_{b2} - x_{a2}}{x_{b1} - x_{a1}}$. Along the boundary line,

$$0 = w_1 x_{b1} + w_2 x_{b2} + b - (w_1 x_{a1} + w_2 x_{a2} + b) \;\Rightarrow\; -w_2 (x_{b2} - x_{a2}) = w_1 (x_{b1} - x_{a1}) \;\Rightarrow\; m = -\frac{w_1}{w_2}.$$
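In code, the boundary $w_1 x_1 + w_2 x_2 + b = 0$ rearranges to $x_2 = -(w_1 x_1 + b)/w_2$, so the line can be plotted directly from the fitted weights. Here is a minimal sketch assuming a scikit-learn LogisticRegression fit on two features; the synthetic data is only for illustration:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Illustrative two-feature, two-class data; substitute your own X and y.
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
clf = LogisticRegression().fit(X, y)

w1, w2 = clf.coef_[0]      # the 2-row weight vector from the question
b = clf.intercept_[0]      # the bias term

# Boundary: w1*x1 + w2*x2 + b = 0  =>  x2 = -(w1*x1 + b) / w2
x1 = np.linspace(X[:, 0].min(), X[:, 0].max(), 100)
x2 = -(w1 * x1 + b) / w2

plt.scatter(X[:, 0], X[:, 1], c=y, cmap="RdBu", edgecolor="white")
plt.plot(x1, x2, "k--", label="decision boundary")
plt.legend()
plt.show()

The dashed line has gradient $-w_1/w_2$, matching the derivation above.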
Single-line decision boundary: the basic strategy for drawing a decision boundary on a scatter plot is to find a single line that separates the data points into regions signifying different classes.
The fundamental application of logistic regression is to determine a decision boundary for a binary classification problem. Although the baseline case is a binary boundary, the approach extends naturally to multi-class classification.
The decision boundary is the line or surface that separates the samples into their different classes, and depending on the model it can be linear or nonlinear. In the case of a logistic regression model, the decision boundary is a straight line.
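To see why it is straight: the model predicts $P(y = 1 \mid x) = \sigma(w^\top x + b)$, where $\sigma(z) = 1/(1 + e^{-z})$. The boundary sits where both classes are equally likely, i.e.

$$\sigma(w^\top x + b) = 0.5 \iff w^\top x + b = 0,$$

which is linear in $x$: a line for two features, a hyperplane in general.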
An advantage of the logistic regression classifier is that once you fit it, you can get probabilities for any sample vector. That may be more interesting to plot. Here's an example using scikit-learn:
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="white")
First, generate the data and fit the classifier to the training set:
# Fit on the first 100 samples, keeping the rest as a test set.
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, weights=[.5, .5], random_state=15)
clf = LogisticRegression().fit(X[:100], y[:100])
Next, make a continuous grid of values and evaluate the probability of each (x, y) point in the grid:
xx, yy = np.mgrid[-5:5:.01, -5:5:.01]
grid = np.c_[xx.ravel(), yy.ravel()]
probs = clf.predict_proba(grid)[:, 1].reshape(xx.shape)
Now, plot the probability grid as a contour map and additionally show the test set samples on top of it:
f, ax = plt.subplots(figsize=(8, 6))
contour = ax.contourf(xx, yy, probs, 25, cmap="RdBu",
                      vmin=0, vmax=1)
ax_c = f.colorbar(contour)
ax_c.set_label("$P(y = 1)$")
ax_c.set_ticks([0, .25, .5, .75, 1])

ax.scatter(X[100:, 0], X[100:, 1], c=y[100:], s=50,
           cmap="RdBu", vmin=-.2, vmax=1.2,
           edgecolor="white", linewidth=1)

ax.set(aspect="equal",
       xlim=(-5, 5), ylim=(-5, 5),
       xlabel="$X_1$", ylabel="$X_2$")
Logistic regression lets you classify new samples at whatever threshold you want, so it doesn't inherently have one "decision boundary." But, of course, a common decision rule is p = .5. We can draw just that contour level using the code above:
f, ax = plt.subplots(figsize=(8, 6))
ax.contour(xx, yy, probs, levels=[.5], cmap="Greys", vmin=0, vmax=.6)

ax.scatter(X[100:, 0], X[100:, 1], c=y[100:], s=50,
           cmap="RdBu", vmin=-.2, vmax=1.2,
           edgecolor="white", linewidth=1)

ax.set(aspect="equal",
       xlim=(-5, 5), ylim=(-5, 5),
       xlabel="$X_1$", ylabel="$X_2$")
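As a quick check, that p = .5 contour should coincide with the straight line $x_2 = -(w_1 x_1 + b)/w_2$ derived from the weights earlier. A small sketch, reusing clf and ax from the code above, overlays it:

# Overlay the closed-form boundary on the same axes; it should sit
# exactly on the p = .5 contour.
w1, w2 = clf.coef_[0]
b = clf.intercept_[0]
xs = np.linspace(-5, 5, 100)
ax.plot(xs, -(w1 * xs + b) / w2, "k:", label="closed-form boundary")
ax.legend()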