I've been searching the net for ~3 hours but haven't found a solution yet. I want to give a precomputed kernel to libsvm and classify a dataset, but:
How can I generate a precomputed kernel? (For example, what is the basic precomputed kernel for the Iris data?)
In the libsvm documentation, it is stated that:
For precomputed kernels, the first element of each instance must be the ID. For example,
samples = [[1, 0, 0, 0, 0], [2, 0, 1, 0, 1], [3, 0, 0, 1, 1], [4, 0, 1, 1, 2]]
problem = svm_problem(labels, samples)
param = svm_parameter(kernel_type=PRECOMPUTED)
What is an ID? There are no further details on that. Can I assign IDs sequentially?
Any libsvm help and an example of precomputed kernels would be really appreciated.
scikit-learn hides most of the details of libsvm when handling custom kernels. You can either pass an arbitrary function as your kernel, in which case it will compute the Gram matrix for you, or pass the precomputed Gram matrix of the kernel.
For the first one, the syntax is:
>>> from sklearn import svm
>>> clf = svm.SVC(kernel=my_kernel)
where my_kernel is your kernel function, and then you can call clf.fit(X, y) and it will compute the kernel matrix for you. In the second case the syntax is:
>>> from sklearn import svm
>>> clf = svm.SVC(kernel="precomputed")
And when you call clf.fit(X, y), X must be the matrix k(X, X), where k is your kernel. See also this example for more details:
http://scikit-learn.org/stable/auto_examples/svm/plot_custom_kernel.html
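To make the shapes concrete, here is a minimal sketch in plain Python (no scikit-learn import, so it runs anywhere). The helper name kernel_matrix and the toy data are made up for illustration. With kernel="precomputed", the matrix passed to fit is k(X_train, X_train), and the matrix passed to predict is k(X_test, X_train), i.e. one row per test sample and one column per training sample.

```python
def linear_kernel(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def kernel_matrix(A, B, kernel=linear_kernel):
    """Matrix with len(A) rows and len(B) columns; entry [i][j] is kernel(A[i], B[j])."""
    return [[kernel(a, b) for b in B] for a in A]


# Toy data, invented for this sketch.
X_train = [[0, 0], [1, 1], [2, 0]]
X_test = [[1, 0], [0, 2]]

K_train = kernel_matrix(X_train, X_train)  # square: n_train x n_train
K_test = kernel_matrix(X_test, X_train)    # rectangular: n_test x n_train

print(K_train)
print(K_test)

# With scikit-learn installed, these would be used roughly as:
#   clf = svm.SVC(kernel="precomputed")
#   clf.fit(K_train, y_train)
#   clf.predict(K_test)
```

Note that the test matrix is computed against the training vectors, not against the test vectors themselves.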
First of all, some background to kernels and SVMs...
If you want to precompute a kernel for n vectors (of any dimension), what you need to do is calculate the kernel function between each pair of examples. The kernel function takes two vectors and gives a scalar, so you can think of a precomputed kernel as an n x n matrix of scalars. It's usually called the kernel matrix, or sometimes the Gram matrix.
There are many different kernels; the simplest is the linear kernel (also known as the dot product):
sum(x_i * y_i) for i in [1..N], where (x_1,...,x_N) and (y_1,...,y_N) are vectors
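As a rough illustration of the formula above, the following sketch computes the linear kernel for every pair of vectors and collects the results into an n x n Gram matrix. The function names and the three toy vectors are assumptions made for this example only.

```python
def linear_kernel(x, y):
    """sum(x_i * y_i) over all components of two equal-length vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


def gram_matrix(vectors, kernel=linear_kernel):
    """n x n matrix whose entry [i][j] is kernel(vectors[i], vectors[j])."""
    return [[kernel(a, b) for b in vectors] for a in vectors]


# Three made-up 2-dimensional vectors.
vectors = [[1, 2], [3, 4], [5, 6]]
K = gram_matrix(vectors)

for row in K:
    print(row)
```

Any other kernel function that takes two vectors and returns a scalar could be passed as the kernel argument instead.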
Secondly, trying to answer your problem...
The documentation about precomputed kernels in libsvm is actually pretty good...
Assume the original training data has three four-feature instances and testing data has one instance:

15  1:1 2:1 3:1 4:1
45      2:3     4:3
25          3:1

15  1:1     3:1

If the linear kernel is used, we have the following new training/testing sets:

15  0:1 1:4 2:6  3:1
45  0:2 1:6 2:18 3:0
25  0:3 1:1 2:0  3:1

15  0:? 1:2 2:0  3:1
Each vector in the second set is a row of the kernel matrix. The value at index zero is the ID, and it appears to be just a sequential count. The value at index 1 of the first vector is the kernel function of the first vector from the first set with itself (i.e. (1x1)+(1x1)+(1x1)+(1x1) = 4), the value at index 2 is the kernel function of the first vector with the second (i.e. (1x3)+(1x3) = 6), and so on for the rest of the example. You can see that the kernel matrix is symmetric, as it should be, because K(x,y) = K(y,x).
It's worth pointing out that the first set of vectors is represented in a sparse format (i.e. missing values are zero), but the kernel matrix isn't and shouldn't be sparse: every entry, including zeros, is written out. I don't know why that is; it just seems to be a libsvm thing.
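The conversion in the documentation example can be reproduced with a short script. This is only a sketch: the helper to_precomputed is a hypothetical name, the sparse vectors are written out densely by hand, and the test line uses a serial number in the 0: slot where the documentation shows "?".

```python
def linear_kernel(x, y):
    """Dot product of two equal-length dense vectors."""
    return sum(a * b for a, b in zip(x, y))


def to_precomputed(labels, vectors, train_vectors):
    """Return libsvm-style lines: '<label> 0:<id> 1:K(x,t1) 2:K(x,t2) ...'."""
    lines = []
    for serial, (label, x) in enumerate(zip(labels, vectors), start=1):
        # Every kernel value is written explicitly, including zeros.
        values = [linear_kernel(x, t) for t in train_vectors]
        fields = ["0:%d" % serial]
        fields += ["%d:%g" % (j, v) for j, v in enumerate(values, start=1)]
        lines.append("%s %s" % (label, " ".join(fields)))
    return lines


# Dense form of the data from the documentation example.
train_labels = [15, 45, 25]
train = [[1, 1, 1, 1], [0, 3, 0, 3], [0, 0, 1, 0]]
test_labels = [15]
test = [[1, 0, 1, 0]]

train_lines = to_precomputed(train_labels, train, train)
test_lines = to_precomputed(test_labels, test, train)  # kernel against training vectors

for line in train_lines + test_lines:
    print(line)
```

The training lines match the second set in the documentation example, and the test line carries the same kernel values 2, 0 and 1.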