
Why does the Gaussian radial basis function map the examples into an infinite-dimensional space?

I've just run through the Wikipedia page about SVMs, and this line caught my eye: "If the kernel used is a Gaussian radial basis function, the corresponding feature space is a Hilbert space of infinite dimensions." http://en.wikipedia.org/wiki/Support_vector_machine#Nonlinear_classification

In my understanding, if I apply a Gaussian kernel in an SVM, the resulting feature space will be m-dimensional (where m is the number of training samples), because you choose your landmarks to be your training examples, and you measure the "similarity" between a specific example and all the examples with the Gaussian kernel. As a consequence, for a single example you'll have as many similarity values as training examples. These become the new feature vectors, which are m-dimensional vectors, not infinite-dimensional ones.
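The landmark construction described above can be sketched as follows (a toy illustration of the questioner's reading, not code from the original post; the helper names `rbf` and `features` are made up):

```python
import math

def rbf(x, z, sigma=1.0):
    # Gaussian similarity between two points
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-sq_dist / (2 * sigma ** 2))

# the m training examples serve as landmarks
training = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # m = 3

def features(x):
    # the new feature vector: one similarity value per training example
    return [rbf(x, landmark) for landmark in training]

f = features((1.0, 1.0))
print(len(f))  # m entries, i.e. an m-dimensional feature vector
```

Each mapped example indeed has exactly m coordinates, which is what makes the "infinite dimensions" claim puzzling at first glance.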

Could somebody explain to me what I'm missing?

Thanks, Daniel

asked May 10 '14 by PDani




2 Answers

The dual formulation of the linear SVM depends only on scalar products of the training vectors. The scalar product essentially measures the similarity of two vectors, so we can generalize it by replacing it with any other "well-behaved" similarity measure (it should be positive-definite, which preserves convexity and lets Mercer's theorem apply). The RBF kernel is just one such measure.

If you take a look at the formula here, you'll see that the RBF kernel is basically a scalar product in a certain infinite-dimensional space.

Proof sketch: write exp(-||x - y||^2 / (2 sigma^2)) = exp(-||x||^2 / (2 sigma^2)) * exp(-||y||^2 / (2 sigma^2)) * exp(<x, y> / sigma^2), then expand the last factor as the Taylor series sum_{k >= 0} <x, y>^k / (sigma^(2k) * k!). Each term <x, y>^k is a polynomial kernel of degree k, i.e. an inner product in a finite-dimensional space, so the whole sum is an inner product in an infinite-dimensional space.

Thus the RBF kernel is, in effect, a weighted sum of polynomial kernels of all possible degrees.
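This expansion can be checked numerically. The sketch below (my own illustration, not from the answer; the function names are made up) compares the RBF kernel value to a truncated version of the series of polynomial kernels:

```python
import math

def rbf(x, y, sigma=1.0):
    # the Gaussian RBF kernel exp(-||x - y||^2 / (2 sigma^2))
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2 * sigma ** 2))

def rbf_via_poly_series(x, y, sigma=1.0, terms=30):
    # exp(-||x||^2/2s^2) * exp(-||y||^2/2s^2) * sum_k <x,y>^k / (s^(2k) k!)
    nx = sum(a * a for a in x)
    ny = sum(b * b for b in y)
    dot = sum(a * b for a, b in zip(x, y))
    series = sum(dot ** k / (sigma ** (2 * k) * math.factorial(k))
                 for k in range(terms))
    return (math.exp(-nx / (2 * sigma ** 2))
            * math.exp(-ny / (2 * sigma ** 2)) * series)

x, y = (0.5, -1.0), (1.5, 0.25)
print(abs(rbf(x, y) - rbf_via_poly_series(x, y)))  # essentially zero
```

The truncated series already agrees with the kernel to machine precision after a few dozen terms, since it is just the Taylor series of the exponential.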

answered Sep 21 '22 by Artem Sobolev


The other answers are correct but don't really tell the right story here. Importantly, you are correct: if you have m distinct training points, then the Gaussian radial basis kernel makes the SVM operate in an m-dimensional space. We say that the radial basis kernel maps to a space of infinite dimension because you can make m as large as you want and the space it operates in keeps growing without bound.

However, other kernels, like the polynomial kernel, do not have this property of the dimensionality scaling with the number of training samples. For example, if you have 1000 2D training samples and you use the polynomial kernel <x,y>^2, then the SVM will operate in a 3-dimensional space, not a 1000-dimensional one.
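The 3-dimensional claim can be made concrete with an explicit feature map (my own sketch, not from the answer): for 2D inputs, <x,y>^2 equals an ordinary inner product of the 3-dimensional vectors (x1^2, sqrt(2)*x1*x2, x2^2), regardless of how many training samples there are.

```python
import math

def phi(x):
    # explicit 3-dimensional feature map for the kernel <x, y>^2 in 2D
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly_kernel(x, y):
    # homogeneous degree-2 polynomial kernel
    dot = x[0] * y[0] + x[1] * y[1]
    return dot ** 2

x, y = (1.0, 2.0), (3.0, -1.0)
lhs = poly_kernel(x, y)
rhs = sum(a * b for a, b in zip(phi(x), phi(y)))
print(lhs, rhs)  # equal: the kernel is an inner product in just 3 dimensions
```

Because the feature space has a fixed, finite dimension, adding more training samples never grows it, which is exactly the contrast with the RBF kernel drawn above.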

answered Sep 21 '22 by Davis King