 

Why do support vectors in SVM have alpha (Lagrange multiplier) greater than zero?

I understand the overall SVM algorithm, including Lagrangian duality, but I am not able to understand why the Lagrange multiplier is greater than zero specifically for the support vectors.

Thank you.

asked Mar 07 '16 by Neel Shah


1 Answer

This might be a late answer but I am putting my understanding here for other visitors.

The Lagrange multiplier, usually denoted by α, is a vector of the weights of all the training points as support vectors.

Suppose there are m training examples. Then α is a vector of size m. Now focus on the ith element of α: αi. It captures the weight of the ith training example as a support vector. A higher value of αi means the ith training example matters more as a support vector; that is, when a prediction is made, the ith training example contributes more to the decision.
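This weighting shows up directly in the dual decision function f(x) = Σi αi yi K(xi, x) + b: only points with αi > 0 contribute at all. A minimal sketch of this, assuming scikit-learn is available; the toy dataset is made up for illustration, and a very large C is used to approximate the hard-margin case:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical linearly separable toy data
X = np.array([[0., 0.], [0., 1.], [1., 0.],
              [3., 3.], [3., 4.], [4., 3.]])
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # large C ~ hard margin

# dual_coef_ stores alpha_i * y_i for the support vectors only;
# every stored coefficient is non-zero by construction
print("support vector indices:", clf.support_)
print("alpha_i * y_i:", clf.dual_coef_[0])

# Points not listed in clf.support_ have alpha_i = 0; they carry no
# weight in the decision function and could be deleted without
# changing the fitted boundary
assert (clf.dual_coef_ != 0).all()
assert len(clf.support_) < len(X)
```

Note that scikit-learn only keeps the non-zero entries of α (paired with their support vectors), which is exactly the "αi = 0 means not a support vector" convention discussed below.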

Now coming to the OP's concern:

I am not able to understand why particularly the Lagrangian multiplier is greater than zero for support vectors.

It is just a construct. When αi=0, the ith training example has zero weight as a support vector. Equivalently, you can say that the ith example is not a support vector.

Side note: One of the KKT conditions is complementary slackness: αigi(w)=0 for all i. A support vector must lie on the margin, which implies gi(w)=0. Then αi may or may not be zero; either way the complementary slackness condition is satisfied. For αi=0, you can choose whether or not to call such a point a support vector, per the discussion above. But for a non-support vector, gi(w) is not zero, so αi must be zero to satisfy complementary slackness.
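The complementary slackness condition can be checked numerically. A minimal sketch, assuming scikit-learn and reusing a made-up linearly separable toy set; here gi(w) = yi(w·xi + b) − 1, and a very large C approximates the hard-margin case:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical linearly separable toy data
X = np.array([[0., 0.], [0., 1.], [1., 0.],
              [3., 3.], [3., 4.], [4., 3.]])
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # large C ~ hard margin
w, b = clf.coef_[0], clf.intercept_[0]

# g_i(w) = y_i (w . x_i + b) - 1, which is >= 0 for a separating hyperplane
g = y * (X @ w + b) - 1

# Recover the full alpha vector: zero everywhere except the support indices
alpha = np.zeros(len(X))
alpha[clf.support_] = np.abs(clf.dual_coef_[0])

# Complementary slackness: alpha_i * g_i(w) = 0 for every i --
# support vectors have g_i ~ 0, non-support vectors have alpha_i = 0
assert np.allclose(alpha * g, 0, atol=1e-3)
```

Either factor being (numerically) zero for each i is exactly the dichotomy above: a point is on the margin, or it has zero dual weight.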

answered Oct 17 '22 by Ankit Shubham