
Computing Lagrange Multipliers for a simple Support Vector Machine

Firstly, I am a beginner to Support Vector Machines so I'm sorry if I am going about this problem in the wrong way. I am trying to implement a very simple SVM from scratch which uses the identity kernel function to classify linearly separable data into one of two classes. As an example of the sort of data which I will be using, consider the plot below seen in this document:

[Figure: plot of the linearly separable example data]

Using the points (1, 0), (3, 1) and (3, -1) as support vectors, we know that the following is true with regards to calculating the decision plane. With each support vector augmented by a bias component, s1 = (1, 0, 1), s2 = (3, 1, 1), s3 = (3, -1, 1), the conditions from the same document are:

    a1 (s1·s1) + a2 (s2·s1) + a3 (s3·s1) = -1
    a1 (s1·s2) + a2 (s2·s2) + a3 (s3·s2) = +1
    a1 (s1·s3) + a2 (s2·s3) + a3 (s3·s3) = +1

which, when fiddled and rearranged a bit, gives us Lagrange multipliers of -3.5, 0.75 and 0.75 respectively.

I understand how this algebra works on paper, but I am unsure of the best approach when it comes to implementation. So my question is: how are an SVM's Lagrange multipliers calculated in practice? Is there an algorithm I am missing that can determine these values for arbitrary linearly separable support vectors? Should I use a standard maths library to solve the linear equations (I am implementing the SVM in Java)? Would such a maths library be too slow for large-scale learning? Note that this is a learning exercise, so I'm not just looking for a ready-made SVM library.

Any other advice would be much appreciated!

EDIT 1: LutzL made a good point that half the problem is actually determining which points are to be used as the support vectors, so to keep things simple assume for the purpose of this question that they have already been computed.
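Since the support vectors are assumed to be already known, one concrete approach is to build the small linear system above and solve it directly, e.g. with Gaussian elimination. The sketch below does exactly that for the three example points; the class and method names are my own illustration, not from any particular library, and a production implementation would use a tested linear-algebra package instead.

```java
// Sketch: computing the Lagrange multipliers for known support vectors by
// solving the linear system  sum_j a_j (s_j . s_i) = y_i  directly,
// where each s_i is the support vector augmented with a bias component of 1.
public class LagrangeMultipliers {

    static double dot(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    // Gaussian elimination with partial pivoting: solves A x = b in place.
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++)
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
            double[] tmpRow = a[col]; a[col] = a[pivot]; a[pivot] = tmpRow;
            double tmp = b[col]; b[col] = b[pivot]; b[pivot] = tmp;
            for (int r = col + 1; r < n; r++) {
                double f = a[r][col] / a[col][col];
                for (int c = col; c < n; c++) a[r][c] -= f * a[col][c];
                b[r] -= f * b[col];
            }
        }
        double[] x = new double[n];
        for (int r = n - 1; r >= 0; r--) {
            double s = b[r];
            for (int c = r + 1; c < n; c++) s -= a[r][c] * x[c];
            x[r] = s / a[r][r];
        }
        return x;
    }

    // Augment each support vector with a bias component of 1, build the
    // kernel matrix K[i][j] = s_i . s_j and solve K alpha = y.
    static double[] multipliers(double[][] supportVectors, double[] labels) {
        int n = supportVectors.length;
        double[][] s = new double[n][];
        for (int i = 0; i < n; i++) {
            int d = supportVectors[i].length;
            s[i] = new double[d + 1];
            System.arraycopy(supportVectors[i], 0, s[i], 0, d);
            s[i][d] = 1.0;  // bias term
        }
        double[][] k = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                k[i][j] = dot(s[i], s[j]);
        return solve(k, labels.clone());
    }

    public static void main(String[] args) {
        double[][] sv = {{1, 0}, {3, 1}, {3, -1}};
        double[] y = {-1, 1, 1};
        double[] alpha = multipliers(sv, y);
        System.out.printf("%.2f %.2f %.2f%n", alpha[0], alpha[1], alpha[2]);
        // prints: -3.50 0.75 0.75
    }
}
```

For three support vectors this is a 3x3 solve, so speed is a non-issue; the cost only becomes interesting when the number of support vectors grows large.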

Hungry asked Feb 12 '15 17:02



1 Answer

Independent of the kernel function, determining the coefficients leads to a quadratic optimization problem with linear positivity constraints. Solved naively, by testing every combination of active boundary constraints, this has horrendous complexity, so you cannot avoid advanced optimization algorithms such as barrier or trust-region methods.
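To make the shape of that optimization concrete: for the hard-margin case the dual problem is to maximize W(a) = sum_i a_i - 1/2 sum_ij a_i a_j y_i y_j (x_i . x_j) subject to a_i >= 0 and sum_i a_i y_i = 0. The sketch below attacks it with plain projected gradient ascent, which is far simpler than the barrier or trust-region methods mentioned above but is enough for the three-point toy problem from the question; the projection step (clip, then re-enforce the equality constraint) is only approximate in general, and all names are illustrative.

```java
// Naive sketch: projected gradient ascent on the hard-margin SVM dual.
// A real implementation would use a dedicated QP method instead.
public class DualAscent {

    // Maximize W(a) = sum a_i - 1/2 a^T Q a  with Q[i][j] = y_i y_j (x_i . x_j),
    // subject to a_i >= 0 and sum a_i y_i = 0 (handled by approximate projection).
    static double[] solveDual(double[][] x, double[] y, double eta, int iters) {
        int n = x.length;
        double[][] q = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) {
                double dot = 0;
                for (int d = 0; d < x[i].length; d++) dot += x[i][d] * x[j][d];
                q[i][j] = y[i] * y[j] * dot;
            }
        double yy = 0;
        for (int i = 0; i < n; i++) yy += y[i] * y[i];
        double[] a = new double[n];
        for (int it = 0; it < iters; it++) {
            // ascend the gradient of W: grad_i = 1 - (Q a)_i
            double[] g = new double[n];
            for (int i = 0; i < n; i++) {
                g[i] = 1;
                for (int j = 0; j < n; j++) g[i] -= q[i][j] * a[j];
            }
            for (int i = 0; i < n; i++) a[i] += eta * g[i];
            // clip to a_i >= 0, then project back onto the plane sum a_i y_i = 0
            for (int i = 0; i < n; i++) a[i] = Math.max(0, a[i]);
            double ay = 0;
            for (int i = 0; i < n; i++) ay += a[i] * y[i];
            for (int i = 0; i < n; i++) a[i] -= y[i] * ay / yy;
        }
        return a;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 0}, {3, 1}, {3, -1}};
        double[] y = {-1, 1, 1};
        double[] a = solveDual(x, y, 0.1, 2000);
        // recover w = sum a_i y_i x_i, and b = y_k - w . x_k from a support vector
        double w0 = 0, w1 = 0;
        for (int i = 0; i < x.length; i++) {
            w0 += a[i] * y[i] * x[i][0];
            w1 += a[i] * y[i] * x[i][1];
        }
        double b = y[0] - (w0 * x[0][0] + w1 * x[0][1]);
        System.out.printf("alpha = %.3f %.3f %.3f, w = (%.2f, %.2f), b = %.2f%n",
                a[0], a[1], a[2], w0, w1, b);
    }
}
```

For the question's data this converges to alpha of roughly (0.5, 0.25, 0.25), i.e. w of roughly (1, 0) and b of roughly -2, the separating line x = 2. Note these dual multipliers are all nonnegative and differ from the (-3.5, 0.75, 0.75) values in the question, because here the bias b is handled by the equality constraint rather than being folded into augmented vectors.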

There are also heuristic approaches that try to keep the optimization problem low-dimensional by searching for point sets close to the separating line and eliminating points that are most probably far away from it.

Lutz Lehmann answered Oct 01 '22 03:10