
How to draw the hyperplanes for SVM One-Versus-All?

I was trying to draw the hyperplanes of a one-versus-all SVM (SVM-OVA) as follows:

import matplotlib.pyplot as plt
import numpy as np
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier

x = np.array([[1,1.1],[1,2],[2,1]])
y = np.array([0,100,250])
classifier = OneVsRestClassifier(SVC(kernel='linear'))
classifier.fit(x, y)

Based on the answer to this question Plot hyperplane Linear SVM python, I wrote the following code:

fig, ax = plt.subplots()
# create a mesh to plot in
x_min, x_max = x[:, 0].min() - 1, x[:, 0].max() + 1
y_min, y_max = x[:, 1].min() - 1, x[:, 1].max() + 1
xx2, yy2 = np.meshgrid(np.arange(x_min, x_max, .2),np.arange(y_min, y_max, .2))
Z = classifier.predict(np.c_[xx2.ravel(), yy2.ravel()])
Z = Z.reshape(xx2.shape)
ax.contourf(xx2, yy2, Z, cmap=plt.cm.winter, alpha=0.3)
ax.scatter(x[:, 0], x[:, 1], c=y, cmap=plt.cm.winter, s=25)

# First line: class1 vs (class2 U class3)
w = classifier.coef_[0]
a = -w[0] / w[1]
xx = np.linspace(-5, 5)
yy = a * xx - (classifier.intercept_[0]) / w[1]
ax.plot(xx,yy)

# Second line: class2 vs (class1 U class3)
w = classifier.coef_[1]
a = -w[0] / w[1]
xx = np.linspace(-5, 5)
yy = a * xx - (classifier.intercept_[1]) / w[1]
ax.plot(xx,yy)

# Third line: class 3 vs (class2 U class1)
w = classifier.coef_[2]
a = -w[0] / w[1]
xx = np.linspace(-5, 5)
yy = a * xx - (classifier.intercept_[2]) / w[1]
ax.plot(xx,yy)

However, this is what I obtained:

[plot of the resulting decision regions and the three lines]

The lines are clearly wrong: the slopes seem correct, but the intercepts do not. In particular, the orange line would be correct if translated down by 0.5, the green one if translated left by 0.5, and the blue one if translated up by 1.5.
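For reference, each line's w and b can be printed and w·x + b evaluated at the training points (a minimal sketch using the classifier fitted above); a plotted line should be the zero set of the corresponding decision function:

# Sketch: print each binary hyperplane and evaluate w.x + b at the data;
# a plotted line should be the zero set of its decision function.
for i, (w, b) in enumerate(zip(classifier.coef_, classifier.intercept_)):
    print(f"line {i}: w = {w}, b = {b}")
    print("  w.x + b at the training points:", x @ w + b)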

Am I drawing the lines incorrectly, or does the classifier not work correctly because of the few training points?

Alessandro asked Nov 26 '20


People also ask

What is ||W|| in SVM?

The objective of SVM is to maximise the width of the separation gap, which is 2/||W||. Maximising 2/||W|| is the same as minimising ||W||, which in turn is the same as minimising ||W||² or (1/2)||W||²; the latter can be written as (1/2)WᵗW.
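As a concrete check, the margin width 2/||w|| can be read off a fitted linear SVC (a minimal sketch on hypothetical toy data; X, y, and clf are illustrative names):

import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two separable clusters at x = 0 and x = 2,
# so the widest possible street has width 2.
X = np.array([[0, 0], [0, 1], [2, 0], [2, 1]])
y = np.array([0, 0, 1, 1])

clf = SVC(kernel='linear', C=1e5).fit(X, y)
w = clf.coef_[0]
print("margin width 2/||w|| =", 2 / np.linalg.norm(w))  # ~2.0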

How do you choose the maximum margin hyperplane?

Here, the maximum-margin hyperplane is the one that divides the group of points for which y = 1 from the group of points for which y = -1, such that the distance between the hyperplane and the nearest point from either group is maximised. In other words, a hyperplane separates the two classes of data while keeping the distance between the hyperplane and each class as large as possible.

What is the optimal hyperplane for a SVM model?

The hyperplane for which the margin is maximum is the optimal hyperplane. Thus SVM tries to place the decision boundary so that the separation between the two classes (that "street") is as wide as possible.

How do you select hyperplane in SVM?

The best hyperplane is the one that has the maximum distance from both classes; finding it is the main aim of SVM. This is done by considering the hyperplanes that classify the labels correctly and choosing the one that is farthest from the data points, i.e. the one with the maximum margin.
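Relatedly, the distance of each point from the learned hyperplane is just the decision function normalised by ||w||, and the points that pin down the maximum-margin hyperplane are exposed as support_vectors_ (a sketch, repeating the toy setup from the earlier snippet):

import numpy as np
from sklearn.svm import SVC

X = np.array([[0, 0], [0, 1], [2, 0], [2, 1]])  # same toy clusters as above
y = np.array([0, 0, 1, 1])
clf = SVC(kernel='linear', C=1e5).fit(X, y)

# decision_function returns w.x + b; dividing by ||w|| turns it into a
# signed geometric distance from the hyperplane.
dist = clf.decision_function(X) / np.linalg.norm(clf.coef_[0])
print(dist)                  # +/- 1.0 here: every point sits on the margin
print(clf.support_vectors_)  # the points that pin down the hyperplane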


1 Answer

The problem is that the C parameter of SVC is too small (1.0 by default). According to this post,

Conversely, a very small value of C will cause the optimizer to look for a larger-margin separating hyperplane, even if that hyperplane misclassifies more points.
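To see this on the question's data, one can compare the margin width 2/||w|| of the first binary subproblem under a small and a large C (a minimal sketch; note that on recent scikit-learn versions the coefficients may need to be read from clf.estimators_[0].coef_ instead of clf.coef_):

import numpy as np
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier

x = np.array([[1, 1.1], [1, 2], [2, 1]])
y = np.array([0, 100, 250])

for C in (1.0, 1e5):
    clf = OneVsRestClassifier(SVC(C=C, kernel='linear')).fit(x, y)
    w = clf.coef_[0]  # first binary problem: class 0 vs the rest
    print(f"C={C:g}: margin width 2/||w|| = {2 / np.linalg.norm(w):.3f}")

A small C tolerates a wider, sloppier margin; a large C forces the near-hard-margin solution whose boundaries the question expects to see.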

Therefore, the solution is to use a much larger C, for example 1e5:

import matplotlib.pyplot as plt
import numpy as np
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier


x = np.array([[1,1.1],[1,2],[2,1]])
y = np.array([0,100,250])
classifier = OneVsRestClassifier(SVC(C=1e5,kernel='linear'))
classifier.fit(x,y)

fig, ax = plt.subplots()
# create a mesh to plot in
x_min, x_max = x[:, 0].min() - 1, x[:, 0].max() + 1
y_min, y_max = x[:, 1].min() - 1, x[:, 1].max() + 1
xx2, yy2 = np.meshgrid(np.arange(x_min, x_max, .2),np.arange(y_min, y_max, .2))
Z = classifier.predict(np.c_[xx2.ravel(), yy2.ravel()])
Z = Z.reshape(xx2.shape)
ax.contourf(xx2, yy2, Z, cmap=plt.cm.winter, alpha=0.3)
ax.scatter(x[:, 0], x[:, 1], c=y, cmap=plt.cm.winter, s=25)

def reconstruct(w, b):
    # Rewrite w[0]*x + w[1]*y + b = 0 as y = k*x + b
    # (b is reused here for the line's intercept).
    k = -w[0] / w[1]
    b = -b[0] / w[1]

    # Clip the segment to the plotting window [x_min, x_max] x [y_min, y_max].
    if k >= 0:
        x0 = max((y_min - b) / k, x_min)
        x1 = min((y_max - b) / k, x_max)
    else:
        x0 = max((y_max - b) / k, x_min)
        x1 = min((y_min - b) / k, x_max)
    # A horizontal line (k == 0) yields infinities; fall back to the full width.
    if np.abs(x0) == np.inf: x0 = x_min
    if np.abs(x1) == np.inf: x1 = x_max

    xx = np.linspace(x0, x1)
    yy = k * xx + b

    return xx, yy

xx,yy = reconstruct(classifier.coef_[0],classifier.intercept_[0])
ax.plot(xx,yy,'r')
xx,yy = reconstruct(classifier.coef_[1],classifier.intercept_[1])
ax.plot(xx,yy,'g')
xx,yy = reconstruct(classifier.coef_[2],classifier.intercept_[2])
ax.plot(xx,yy,'b')
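The three plot calls can equivalently be written as a loop over the binary problems:

for i, color in enumerate('rgb'):
    xx, yy = reconstruct(classifier.coef_[i], classifier.intercept_[i])
    ax.plot(xx, yy, color)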

This time, because a much larger C is adopted, the result looks better:

[plot of the corrected decision regions and hyperplanes]

meTchaikovsky answered Sep 28 '22