 

SVM and Neural Network

What is the difference between an SVM and a neural network? Is it true that a linear SVM is the same as a NN, and that for non-linearly separable problems a NN handles this by adding hidden layers while an SVM handles it by changing the space dimensions?

CoyBit asked Jan 22 '12



2 Answers

There are two parts to this question. The first part is "what is the form of the function learned by these methods?" For NNs and SVMs this is typically the same. For example, a single-hidden-layer neural network uses exactly the same form of model as an SVM. That is:

Given an input vector x, the output is: output(x) = sum_over_all_i weight_i * nonlinear_function_i(x)

Generally the nonlinear functions will also have some parameters. So these methods need to learn how many nonlinear functions should be used, what their parameters are, and what the values of all the weight_i coefficients should be.
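
For concreteness, here is a minimal NumPy sketch of that shared form. All the support vectors, hidden-layer parameters, and weights below are made-up illustration values, not anything learned from data: for a kernel SVM the nonlinear functions are kernel evaluations against support vectors, and for a one-hidden-layer NN they are the hidden-unit activations.

    import numpy as np

    # Both models compute output(x) = sum_i weight_i * nonlinear_function_i(x);
    # only the choice of nonlinear_function_i differs.

    def svm_style_output(x, support_vectors, weights, gamma=1.0):
        # Kernel SVM: nonlinear_function_i(x) = K(x, support_vector_i), here an RBF kernel.
        phi = np.exp(-gamma * np.sum((support_vectors - x) ** 2, axis=1))
        return weights @ phi

    def nn_style_output(x, hidden_w, hidden_b, weights):
        # One-hidden-layer NN: nonlinear_function_i(x) = tanh(v_i . x + b_i).
        phi = np.tanh(hidden_w @ x + hidden_b)
        return weights @ phi

    x = np.array([0.5, -1.0])
    weights = np.array([0.3, -0.7])
    print(svm_style_output(x, np.array([[0.0, 0.0], [1.0, -1.0]]), weights))
    print(nn_style_output(x, np.array([[1.0, 0.5], [-0.2, 0.8]]), np.array([0.1, -0.1]), weights))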

Therefore, the difference between an SVM and a NN is in how they decide what these parameters should be set to. Usually when someone says they are using a neural network, they mean they are trying to find the parameters that minimize the mean squared prediction error over a set of training examples, and they will almost always be using the stochastic gradient descent optimization algorithm to do this. SVMs, on the other hand, try to minimize both the training error and some measure of "hypothesis complexity", so they find a set of parameters that fits the data but is also "simple" in some sense. You can think of it as Occam's razor for machine learning. The most common optimization algorithm used with SVMs is sequential minimal optimization.
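
As a rough illustration of those two training styles (assuming scikit-learn is available), the snippet below fits both kinds of model on a toy dataset. One caveat: MLPClassifier minimizes log-loss rather than squared error for classification, but the optimizer contrast described above still holds, with stochastic gradient descent for the NN and an SMO-type solver (libsvm) inside SVC.

    from sklearn.datasets import make_classification
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, n_features=5, random_state=0)

    # NN: parameters found by stochastic gradient descent on a prediction-error loss.
    nn = MLPClassifier(hidden_layer_sizes=(10,), solver="sgd", max_iter=2000,
                       random_state=0).fit(X, y)

    # SVM: training error plus a "simplicity" (margin) penalty, controlled by C,
    # solved with sequential minimal optimization under the hood.
    svm = SVC(kernel="rbf", C=1.0).fit(X, y)

    print("NN  training accuracy:", nn.score(X, y))
    print("SVM training accuracy:", svm.score(X, y))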

Another big difference between the two methods is that stochastic gradient descent, as NN implementations typically use it, isn't guaranteed to find the optimal set of parameters, whereas any decent SVM implementation will find the optimal set. People like to say that neural networks can get stuck in a local minimum while SVMs don't.
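
A quick way to see this claim in practice (again assuming scikit-learn): train the same small network from a few different random initializations and compare the final training losses, then note that SVC needs no such initialization because its optimization problem is convex.

    from sklearn.datasets import make_classification
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=300, n_features=10, random_state=1)

    # Same architecture, different random initializations: the final training loss
    # can differ because the NN objective is non-convex.
    for seed in (0, 1, 2):
        nn = MLPClassifier(hidden_layer_sizes=(20,), max_iter=500,
                           random_state=seed).fit(X, y)
        print(f"NN final training loss, seed {seed}: {nn.loss_:.4f}")

    # SVC takes no random initialization for the optimization itself; the convex
    # problem has a single optimum, so repeated fits reach the same solution.
    svm = SVC(kernel="rbf").fit(X, y)
    print("SVM training accuracy:", svm.score(X, y))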

Davis King answered Sep 22 '22


NNs are heuristic, while SVMs are theoretically founded. An SVM is guaranteed to converge towards the best solution in the PAC (probably approximately correct) sense. For example, for two linearly separable classes an SVM will draw the separating hyperplane directly halfway between the nearest points of the two classes (these become the support vectors). A neural network would draw any line that separates the samples, which is correct for the training set but might not have the best generalization properties.

So no, even for linearly separable problems NNs and SVMs are not the same.
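
A small sketch of the maximum-margin behavior described above, assuming scikit-learn is available: on a hand-made linearly separable set, a (nearly) hard-margin linear SVC reports the nearest points of the two classes as its support vectors, and its decision function is roughly -1/+1 exactly at those points, i.e. the boundary sits halfway between them.

    import numpy as np
    from sklearn.svm import SVC

    # Two well-separated clusters; with a large C the soft margin is effectively hard.
    X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],    # class 0
                  [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])   # class 1
    y = np.array([0, 0, 0, 1, 1, 1])

    svm = SVC(kernel="linear", C=1e6).fit(X, y)
    print("support vectors:\n", svm.support_vectors_)
    # At the support vectors the decision function is roughly -1 / +1,
    # so the separating hyperplane lies halfway between them.
    print(svm.decision_function([[1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]))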

In the case of linearly non-separable classes, both SVMs and NNs apply a non-linear projection into a higher-dimensional space. In the case of NNs this is achieved by introducing additional neurons in the hidden layer(s). For SVMs, a kernel function is used to the same effect. A neat property of the kernel function is that the computational complexity doesn't rise with the dimensionality of the (implicit) feature space, while for NNs it obviously rises with the number of neurons.
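
To see why the kernel side of this is cheap, here is a minimal sketch: one RBF kernel evaluation corresponds to an inner product in a very high-dimensional (implicit) feature space, yet the work done stays O(d) in the input dimension d.

    import numpy as np

    def rbf_kernel(x, z, gamma=1.0):
        # Equivalent to an inner product after an (implicitly infinite-dimensional)
        # feature mapping, but computed with only O(d) operations in input space.
        return np.exp(-gamma * np.sum((x - z) ** 2))

    d = 10_000
    x, z = np.random.randn(d), np.random.randn(d)
    print(rbf_kernel(x, z, gamma=1.0 / d))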

Igor F. answered Sep 21 '22