SVM versus MLP (Neural Network): compared by performance and prediction accuracy

I need to decide between an SVM and a neural network for an image processing application. The classifier must be fast enough for near-real-time use, and accuracy is important too. Since this is a medical application, it is important that the classifier has a low failure rate.

Which one is the better choice?

asked May 20 '12 09:05

Lily

1 Answer

A couple of provisos:

First, the performance of an ML classifier can refer to either (i) the performance of the classifier itself, or (ii) the performance of the preceding step, i.e. the execution speed of the model-building algorithm. Particularly in this case, the answer is quite different depending on which of the two the OP intends, so I'll answer each separately.

Second, by Neural Network, I'll assume you're referring to the most common implementation--i.e., a feed-forward, single-hidden-layer perceptron trained by back-propagation.

Training Time (execution speed of the model builder)

For SVM compared to NN: SVMs are much slower to train. There is a straightforward reason for this: SVM training requires solving the associated Lagrangian dual (rather than primal) problem. This is a quadratic optimization problem in which the number of variables is very large--i.e., equal to the number of training instances (the 'length' of your data matrix).
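
For reference, here is the dual problem in its standard soft-margin textbook form. The symbols are assumptions of this sketch rather than notation from the question: $n$ training instances $x_i$ with labels $y_i \in \{-1, +1\}$, a kernel $K$, and a margin parameter $C$. There is one variable $\alpha_i$ per training instance, and the objective involves an $n \times n$ kernel matrix:

```latex
\max_{\alpha \in \mathbb{R}^n} \;
  \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
    \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j)
\quad \text{subject to} \quad
  0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0
```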

In practice, two factors, if present in your scenario, could affect this comparison of training times:

NN training is trivial to parallelize (via MapReduce); parallelizing SVM training is not trivial, but it's also not impossible--within the past eight or so years, several implementations have been published and proven to work (https://bibliographie.uni-tuebingen.de/xmlui/bitstream/handle/10900/49015/pdf/tech_21.pdf).
Multi-class classification problems: SVMs are two-class classifiers. They can be adapted for multi-class problems, but this is never straightforward because SVMs use direct decision functions. (An excellent source for modifying SVMs to multi-class problems is S. Abe, Support Vector Machines for Pattern Classification, Springer, 2005). This modification could wipe out any performance advantage SVMs have over NNs. For instance, if your data has more than two classes, you might configure the SVM using successive classification (aka one-against-many classification), in which a data point is fed to a first SVM classifier that classifies it as either class I or other; if the class is other, the data point is fed to a second classifier that classifies it as class II or other, and so on. A single prediction can then require several classifier evaluations.
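
The successive-classification scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stages are hypothetical threshold functions standing in for trained two-class SVMs, and the names `cascade_predict` and `stages` are invented for this example. The point it shows is that the number of binary classifiers evaluated grows with how far down the chain a point falls.

```python
from typing import Callable, Sequence, Tuple

# A binary classifier here is any function mapping a feature vector to
# True ("belongs to my class") or False ("other"). These are hypothetical
# stand-ins for trained two-class SVMs.
BinaryClassifier = Callable[[Sequence[float]], bool]


def cascade_predict(
    stages: Sequence[Tuple[str, BinaryClassifier]],
    x: Sequence[float],
    fallback: str = "other",
) -> Tuple[str, int]:
    """Return (label, number of binary classifiers evaluated) for point x."""
    evaluations = 0
    for label, is_member in stages:
        evaluations += 1
        if is_member(x):
            return label, evaluations
    # No stage claimed the point, so it stays in the residual class.
    return fallback, evaluations


# Toy stages using thresholds on the first feature, purely for illustration.
stages = [
    ("class I", lambda x: x[0] < 1.0),
    ("class II", lambda x: x[0] < 2.0),
    ("class III", lambda x: x[0] < 3.0),
]

print(cascade_predict(stages, [0.5]))  # claimed by the first stage
print(cascade_predict(stages, [2.5]))  # falls through to the third stage
```

A point belonging to a late class pays for every earlier stage, which is how the per-prediction cost can climb as classes are added.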

Prediction Performance (execution speed of the model)

Prediction with an SVM is substantially faster than with an NN. For a three-layer (one hidden-layer) NN, prediction requires successive multiplication of an input vector by two 2D matrices (the weight matrices). For an SVM, classification involves determining on which side of the decision boundary a given point lies, in other words a dot product (in the linear case; with a non-linear kernel, it is one kernel evaluation per support vector).
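
To make the difference in prediction cost concrete, here is a minimal pure-Python sketch of both prediction steps. It assumes a linear SVM and a one-hidden-layer network with tanh activations and no bias terms; the function names and the toy weights are invented for illustration and do not come from any library.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def mat_vec(m: Matrix, v: Vector) -> List[float]:
    """Matrix-vector product: one dot product per matrix row."""
    return [dot(row, v) for row in m]


def nn_predict(w_hidden: Matrix, w_out: Matrix, x: Vector) -> List[float]:
    """One-hidden-layer forward pass: two matrix-vector products."""
    hidden = [math.tanh(h) for h in mat_vec(w_hidden, x)]
    return mat_vec(w_out, hidden)


def linear_svm_predict(w: Vector, b: float, x: Vector) -> int:
    """Linear SVM decision: the sign of a single dot product plus a bias."""
    return 1 if dot(w, x) + b >= 0 else -1


x = [1.0, -2.0]
print(nn_predict([[1.0, 0.0], [0.0, 1.0]], [[1.0, -1.0]], x))
print(linear_svm_predict([0.5, 0.25], 0.1, x))
```

The NN does work proportional to the sizes of both weight matrices, while the linear SVM does work proportional to the input dimension alone.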

Prediction Accuracy

By "failure rate" I assume you mean error rate rather than failure of the classifier in production use. If the latter, then there is very little if any difference between SVM and NN--both models are generally numerically stable.

Comparing prediction accuracy of the two models, and assuming both are competently configured and trained, the SVM will outperform the NN.

The superior resolution of SVM versus NN is well documented in the scientific literature. It is true that such a comparison depends on the data, the configuration, and parameter choice of the two models. In fact, this comparison has been so widely studied--over perhaps all conceivable parameter space--and the results so consistent, that even the existence of a few exceptions (though I'm not aware of any) under impractical circumstances shouldn't interfere with the conclusion that SVMs outperform NNs.

Why does SVM outperform NN?

These two models are based on fundamentally different learning strategies.

In NN, network weights (the NN's fitting parameters, adjusted during training) are adjusted such that the sum-of-square error between the network output and the actual value (target) is minimized.

Training an SVM, by contrast, means explicitly determining the decision boundaries directly from the training data. This follows from the optimization problem that must be solved to build an SVM model: maximizing the margin, i.e. the distance between the separating hyperplane and the support vectors.
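
In standard textbook notation (assumed here, not taken from the original answer), the two objectives can be written side by side. With training pairs $(x_i, y_i)$, $i = 1, \dots, n$, the NN with weights $\theta$ and output $f(x_i; \theta)$ minimizes a sum-of-squares error, while the soft-margin SVM with hyperplane $(w, b)$, slack variables $\xi_i$, and margin parameter $C$ maximizes the margin $2 / \lVert w \rVert$, which is equivalent to minimizing $\tfrac{1}{2} \lVert w \rVert^2$:

```latex
E(\theta) = \frac{1}{2} \sum_{i=1}^{n} \bigl( f(x_i; \theta) - y_i \bigr)^2

\min_{w,\, b,\, \xi} \;
  \frac{1}{2} \lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
  y_i \bigl( w^\top x_i + b \bigr) \ge 1 - \xi_i, \qquad \xi_i \ge 0
```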

In practice, though, it is harder to configure the algorithm to train an SVM. The reason is the large (compared to NN) number of parameters required for configuration:

choice of kernel
selection of kernel parameters
selection of the value of the margin parameter
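
As a rough illustration of why these three choices make configuration laborious, the sketch below enumerates a small search grid using only the standard library. The candidate values are hypothetical, and `score` is a placeholder for whatever evaluation you use (for example, cross-validated accuracy); neither comes from the original answer or from a specific library.

```python
from itertools import product
from typing import Callable, Dict, List

Config = Dict[str, object]

# Hypothetical candidate values; sensible ranges depend on the data.
KERNELS = ["linear", "rbf", "poly"]
GAMMAS = [0.001, 0.01, 0.1, 1.0]      # kernel parameter
C_VALUES = [0.1, 1.0, 10.0, 100.0]    # margin parameter


def build_grid() -> List[Config]:
    """Enumerate every (kernel, gamma, C) combination to be evaluated.

    For simplicity gamma is listed for every kernel, even though a
    linear kernel would not use it.
    """
    return [
        {"kernel": k, "gamma": g, "C": c}
        for k, g, c in product(KERNELS, GAMMAS, C_VALUES)
    ]


def best_config(grid: List[Config], score: Callable[[Config], float]) -> Config:
    """Return the configuration with the highest score.

    `score` is supplied by the caller; it is a placeholder here,
    not a real library call.
    """
    return max(grid, key=score)


grid = build_grid()
print(len(grid))  # 3 kernels * 4 gammas * 4 C values = 48 configurations
```

Each configuration in the grid implies a full training run, so the slow training discussed earlier is multiplied by the size of the grid.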

answered Sep 17 '22 17:09

doug

Tags: machine-learning, neural-network, deep-learning, svm
