
SVM versus MLP (Neural Network): compared by performance and prediction accuracy

I need to decide between SVM and neural networks for an image processing application. The classifier must be fast enough for near-real-time use, and accuracy is important too. Since this is a medical application, it is important that the classifier has a low failure rate.

Which one is the better choice?

Lily asked May 20 '12 09:05




1 Answer

A couple of provisos:

First, performance of an ML classifier can refer to either (i) performance of the trained classifier itself; or (ii) performance of the prerequisite step: execution speed of the model-building algorithm. In this case the answer is quite different depending on which of the two the OP intends, so I'll answer each separately.

Second, by Neural Network I'll assume you're referring to the most common implementation, i.e. a feed-forward, single-hidden-layer perceptron trained by back-propagation.

Training Time (execution speed of the model builder)

For SVM compared to NN: SVMs are much slower. There is a straightforward reason for this: SVM training requires solving the associated Lagrangian dual (rather than primal) problem. This is a quadratic optimization problem in which the number of variables is very large--i.e., equal to the number of training instances (the 'length' of your data matrix).
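For reference, the dual problem mentioned above can be written as follows. This is the standard textbook soft-margin formulation, not something taken from the original post; the symbols (multipliers \alpha_i, labels y_i in {-1, +1}, kernel K, margin parameter C, and n training instances) are introduced here for illustration.

```latex
\max_{\alpha \in \mathbb{R}^{n}} \;
  \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
    \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j)
\quad \text{subject to} \quad
  0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0
```

There is one variable \alpha_i per training instance, and the quadratic term involves an n-by-n matrix of kernel values, which is why the cost grows quickly with the size of the training set.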

In practice, two factors, if present in your scenario, could change this comparison:

  • NN training is trivial to parallelize (via MapReduce); parallelizing SVM training is not trivial, but it's also not impossible--within the past eight or so years, several implementations have been published and proven to work (https://bibliographie.uni-tuebingen.de/xmlui/bitstream/handle/10900/49015/pdf/tech_21.pdf)

  • Multi-class classification problems: SVMs are two-class classifiers. They can be adapted for multi-class problems, but this is never straightforward because SVMs use direct decision functions. (An excellent source for modifying SVMs to multi-class problems is S. Abe, Support Vector Machines for Pattern Classification, Springer, 2005.) This modification could wipe out any performance advantage SVMs have over NNs. For instance, if your data has more than two classes, you might configure the SVM using successive classification (aka one-against-many classification), in which data is fed to a first SVM classifier that classifies the data point as either class I or other; if the class is other, the data point is fed to a second classifier that classifies it as class II or other, and so on. Each extra stage adds work at both training and prediction time.
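The successive (one-against-many) scheme described above can be sketched as a simple cascade. This is a minimal illustration, not a real SVM: the function name `cascade_predict` and the threshold lambdas standing in for trained two-class classifiers are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a trained two-class SVM:
# returns True if the point belongs to "its" class, False for "other".
BinaryClassifier = Callable[[Sequence[float]], bool]


def cascade_predict(
    stages: List[Tuple[str, BinaryClassifier]],
    x: Sequence[float],
    fallback: str = "other",
) -> Tuple[str, int]:
    """Run the point through each stage in order.

    Returns (label, number_of_classifiers_evaluated).
    """
    for evaluated, (label, is_member) in enumerate(stages, start=1):
        if is_member(x):
            return label, evaluated
    return fallback, len(stages)


# Toy stages: simple thresholds play the role of the two-class classifiers.
stages = [
    ("I", lambda x: x[0] < 1.0),
    ("II", lambda x: x[0] < 2.0),
    ("III", lambda x: x[0] < 3.0),
]

print(cascade_predict(stages, [0.5]))  # first stage accepts the point
print(cascade_predict(stages, [2.5]))  # falls through to the third stage
```

The second call evaluates three classifiers before it returns, which is the extra prediction cost the multi-class adaptation can introduce.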

Prediction Performance (execution speed of the model)

Prediction speed of an SVM is substantially higher than that of an NN. For a three-layer (one hidden-layer) NN, prediction requires successive multiplication of an input vector by two 2D matrices (the weight matrices). For an SVM, classification involves determining on which side of the decision boundary a given point lies, in other words a dot product (for a linear decision function).
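A rough sketch of the two prediction paths, in plain Python, assuming a linear SVM decision function and a tanh hidden layer (both are assumptions made for illustration; the function names are hypothetical). With a non-linear kernel, SVM prediction instead sums kernel evaluations over the support vectors, so its cost grows with their number.

```python
import math


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def linear_svm_predict(w, b, x):
    """Which side of the hyperplane w.x + b = 0 the point lies on."""
    return 1 if dot(w, x) + b >= 0 else -1


def mlp_predict(W1, b1, W2, b2, x):
    """One-hidden-layer forward pass: two matrix-vector products."""
    hidden = [math.tanh(dot(row, x) + bias) for row, bias in zip(W1, b1)]
    return [dot(row, hidden) + bias for row, bias in zip(W2, b2)]


def multiply_counts(n_inputs, n_hidden, n_outputs):
    """Scalar multiplications needed per prediction for each model."""
    return {
        "linear_svm": n_inputs,
        "mlp": n_inputs * n_hidden + n_hidden * n_outputs,
    }


print(linear_svm_predict([1.0, -1.0], 0.0, [2.0, 1.0]))
print(multiply_counts(n_inputs=100, n_hidden=50, n_outputs=2))
```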

Prediction Accuracy

By "failure rate" I assume you mean error rate rather than failure of the classifier in production use. If the latter, then there is very little if any difference between SVM and NN--both models are generally numerically stable.

Comparing prediction accuracy of the two models, and assuming both are competently configured and trained, the SVM will outperform the NN.

The superior resolution of SVM versus NN is well documented in the scientific literature. It is true that such a comparison depends on the data, the configuration, and the parameter choices of the two models. In fact, this comparison has been so widely studied--over perhaps all conceivable parameter space--and the results so consistent, that even the existence of a few exceptions (though I'm not aware of any) under impractical circumstances shouldn't interfere with the conclusion that SVMs outperform NNs.

Why does SVM outperform NN?

These two models are based on fundamentally different learning strategies.

In NN, network weights (the NN's fitting parameters, adjusted during training) are adjusted such that the sum-of-square error between the network output and the actual value (target) is minimized.
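Written out, the sum-of-squares objective looks like this. The notation is assumed here for illustration: f(x_i; w) is the network output for input x_i with weights w, t_i is the target, n is the number of training instances, and \eta is a learning rate.

```latex
E(w) = \frac{1}{2} \sum_{i=1}^{n} \bigl\lVert f(x_i; w) - t_i \bigr\rVert^{2},
\qquad
w \leftarrow w - \eta \, \nabla_{w} E(w)
```

Back-propagation is the procedure that computes the gradient \nabla_{w} E(w) layer by layer, and the update on the right is one plain gradient-descent step.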

Training an SVM, by contrast, means an explicit determination of the decision boundaries directly from the training data. This is of course required as the prerequisite step to the optimization problem required to build an SVM model: maximizing the margin, i.e. the distance between the separating hyperplane and the nearest training points (the support vectors).

In practice, though, it is harder to configure the algorithm to train an SVM. The reason is the large (compared to NN) number of parameters required for configuration:

  • choice of kernel

  • selection of kernel parameters

  • selection of the value of the margin parameter
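To give a feel for how these three choices multiply, here is a small sketch that enumerates candidate configurations. The kernel names, the parameter name `gamma`, the name `C` for the margin parameter, and all the values are assumptions chosen for illustration, not recommendations from the answer.

```python
from itertools import product

# Hypothetical candidate values for each configuration choice.
kernels = ["linear", "rbf"]
gammas = [0.01, 0.1, 1.0]   # kernel parameter; only the "rbf" kernel uses it
margins = [0.1, 1.0, 10.0]  # margin parameter, often written C


def candidate_configs(kernels, gammas, margins):
    """List every configuration that would need to be trained and compared."""
    configs = []
    for kernel, C in product(kernels, margins):
        if kernel == "linear":
            # The linear kernel has no kernel parameter to tune.
            configs.append({"kernel": kernel, "C": C})
        else:
            for gamma in gammas:
                configs.append({"kernel": kernel, "gamma": gamma, "C": C})
    return configs


configs = candidate_configs(kernels, gammas, margins)
print(len(configs))  # number of SVMs to train and validate
```

Each entry corresponds to one full SVM training run, so even this small grid means training and validating a dozen models.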

doug answered Sep 17 '22 17:09