I have a very general question: how do I choose the right kernel function for an SVM? I know the ultimate answer is to try all the kernels, do out-of-sample validation, and pick the one with the best classification result. But other than that, are there any guidelines for trying the different kernel functions?
One possibility you might try is simulating Gaussian processes with different kernels. That way, you can get a feel for the kinds of functions each kernel produces. This is most easily done by selecting a grid of input values and drawing samples from the multivariate normal distribution whose covariance matrix is the kernel evaluated on that grid.
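The simulation described above can be sketched as follows; the particular kernels, grid, and jitter term are illustrative choices, not part of the original answer:

```python
import numpy as np

def rbf(x, z, gamma=1.0):
    """RBF (squared-exponential) kernel on scalar inputs."""
    return np.exp(-gamma * (x - z) ** 2)

def linear(x, z):
    """Linear kernel on scalar inputs."""
    return x * z

grid = np.linspace(-3, 3, 50)  # grid of input values

for name, k in [("RBF", rbf), ("linear", linear)]:
    # Covariance matrix implied by evaluating the kernel on the grid
    K = np.array([[k(x, z) for z in grid] for x in grid])
    K += 1e-8 * np.eye(len(grid))  # small jitter for numerical stability
    # One draw from the GP = one draw from this multivariate normal
    sample = np.random.multivariate_normal(np.zeros(len(grid)), K)
    print(name, "sample values:", sample[:3], "...")
```

Plotting `sample` against `grid` for each kernel makes the contrast vivid: RBF draws are smooth wiggly curves, while linear-kernel draws are straight lines through the origin.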
A function K(x,z) is a valid kernel if it corresponds to an inner product in some (perhaps infinite-dimensional) feature space. For example, the degree-2 polynomial kernel on n-dimensional inputs corresponds to a dot product between vectors in a feature space of dimensionality n(n+1)/2. The kernel trick allows you to save time and space by computing that dot product while working only in the original n-dimensional space.
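A minimal sketch of that equivalence for the degree-2 polynomial kernel K(x,z) = (x·z)²; the explicit feature map `phi` below is one illustrative construction of the n(n+1)/2-dimensional space:

```python
import numpy as np
from itertools import combinations

def phi(x):
    """Explicit feature map: squared terms plus sqrt(2)-scaled cross terms,
    giving n + n(n-1)/2 = n(n+1)/2 features."""
    squares = [xi ** 2 for xi in x]
    crosses = [np.sqrt(2) * x[i] * x[j]
               for i, j in combinations(range(len(x)), 2)]
    return np.array(squares + crosses)

def kernel(x, z):
    """Kernel trick: the same inner product, computed in n dimensions."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0, 3.0])
z = np.array([0.5, -1.0, 2.0])

print(len(phi(x)))                                        # 6 = n(n+1)/2 for n = 3
print(np.isclose(np.dot(phi(x), phi(z)), kernel(x, z)))   # True
```

The explicit map costs O(n²) per vector, while the kernel evaluation costs O(n), which is the saving the answer refers to.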
The most straightforward test is based on the following: a kernel function is valid only if the kernel (Gram) matrix it produces for every set of data points is positive semidefinite, i.e. has no negative eigenvalues. You cannot check every possible set, but you can take a reasonably large sample of data points, form the kernel matrix, and check its eigenvalues: a negative eigenvalue proves the kernel is invalid, while passing the check is good (though not conclusive) evidence that it is valid.
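A sketch of that empirical check; the two candidate kernels and the random point sample are illustrative assumptions:

```python
import numpy as np

def gram(kernel, X):
    """Kernel (Gram) matrix for a set of data points X."""
    return np.array([[kernel(x, z) for z in X] for x in X])

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))  # a reasonably large set of data points

# RBF is a true kernel; a negated dot product is not.
valid = lambda x, z: np.exp(-np.sum((x - z) ** 2))
invalid = lambda x, z: -np.dot(x, z)

for name, k in [("rbf", valid), ("neg-dot", invalid)]:
    eigs = np.linalg.eigvalsh(gram(k, X))
    print(name, "min eigenvalue:", eigs.min())
# The RBF Gram matrix has no eigenvalue below ~0 (up to float error);
# the negated dot product yields clearly negative ones, so it is invalid.
```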
Always try the linear kernel first, simply because it is so much faster and can yield great results in many cases (specifically high-dimensional problems, where the data are often close to linearly separable).

If the linear kernel fails, your best bet in general is an RBF kernel, as RBF kernels are known to perform very well on a large variety of problems.
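The "linear first, then RBF" workflow can be sketched with scikit-learn; the synthetic dataset and default hyperparameters here are illustrative assumptions, and in practice you would also tune C (and gamma for RBF):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Illustrative synthetic classification problem
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

for kernel in ("linear", "rbf"):
    scores = cross_val_score(SVC(kernel=kernel), X, y, cv=5)
    print("%-6s mean CV accuracy: %.3f" % (kernel, scores.mean()))
```

If the linear kernel's cross-validated score is already close to the RBF score, the faster linear model is usually the better choice.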
Look here to find the answer:

https://stats.stackexchange.com/questions/18030/how-to-select-kernel-for-svm

Basically, there is no single good path to choosing a kernel, unless you know something important about your data that points to the proper one. Follow the link above for more specific information.