Designing a Kernel for a support vector machine (XOR)

The meat of my question is "how does one design a kernel function for a learning problem?"

As quick background: I'm reading books on support vector machines and kernel machines, and everywhere I look authors give examples of kernels (homogeneous and inhomogeneous polynomial kernels, Gaussian kernels, and allusions to text-based kernels, to name a few), but they all either show pictures of the results without specifying the kernel, or vaguely claim that "an efficient kernel can be constructed". I'm interested in the process that goes on when one designs a kernel for a new problem.

Probably the easiest example is learning XOR: the smallest (4-point) non-linearly separable data set, embedded in the real plane. How would one come up with a natural (and non-trivial) kernel that linearly separates this data?

As a more complex example (see Cristianini, Introduction to SVMs, figure 6.2), how would one design a kernel to learn a checkerboard pattern? Cristianini states that the picture was derived "using Gaussian kernels", but it seems he uses several, combined and modified in an unspecified way.
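(For what it's worth, a single Gaussian/RBF kernel with a sufficiently small bandwidth can already fit a checkerboard reasonably well; the sketch below assumes scikit-learn is available, and the grid size, `gamma`, and `C` values are illustrative choices, not taken from Cristianini.)

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# 500 random points in [0, 4)^2, labelled by a 4x4 checkerboard:
# the parity of the sum of the integer cell indices
X = rng.uniform(0, 4, size=(500, 2))
y = (np.floor(X[:, 0]).astype(int) + np.floor(X[:, 1]).astype(int)) % 2

# A small-bandwidth Gaussian kernel lets the decision surface
# wrap around each cell of the board
clf = SVC(kernel="rbf", gamma=10.0, C=10.0).fit(X, y)
print(clf.score(X, y))  # high training accuracy on the checkerboard
```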

If this question is too broad to answer here, I'd appreciate a reference to the construction of one such kernel function, though I'd prefer the example be somewhat simple.

asked May 14 '11 by JeremyKun


1 Answer

Q: "How does one design a kernel function for a learning problem?"

A: "Very carefully"

Trying the usual suspects (linear, polynomial, RBF) and using whichever works best really is sound advice for someone trying to get the most accurate predictive model they can. For what it's worth, it's a common criticism of SVMs that they seem to have a lot of parameters that you need to tune empirically. So at least you're not alone.
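(The "try the usual suspects" approach can be sketched as a cross-validated grid search; this assumes scikit-learn is available, and the toy `make_moons` data set and parameter ranges are illustrative choices.)

```python
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import make_moons

# A toy non-linearly-separable data set standing in for your real problem
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# One entry per kernel family, each with its own hyperparameters to tune
param_grid = [
    {"kernel": ["linear"], "C": [0.1, 1, 10]},
    {"kernel": ["poly"], "degree": [2, 3], "C": [0.1, 1, 10]},
    {"kernel": ["rbf"], "gamma": [0.1, 1, 10], "C": [0.1, 1, 10]},
]

# 5-fold cross-validation picks the kernel and parameters empirically
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```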

If you really want to design a kernel for a specific problem then you are right, it is a machine learning problem all in itself. It's called the 'model selection problem'. I'm not exactly an expert myself here, but the best source of insight into kernel methods for me was the book 'Gaussian Processes for Machine Learning' by Rasmussen and Williams (it's freely available online), particularly chapters 4 and 5. I'm sorry that I can't say much more than 'read this huge book full of maths', but it's a complicated problem and they do a really good job of explaining it.

answered Oct 28 '22 by Stompchicken