Precomputed Kernels with LibSVM in Python

Tags: python, machine-learning, libsvm

I've been searching the net for ~3 hours but I couldn't find a solution yet. I want to give a precomputed kernel to libsvm and classify a dataset, but:

How can I generate a precomputed kernel? (for example, what is the basic precomputed kernel for Iris data?)

In the libsvm documentation, it is stated that:

For precomputed kernels, the first element of each instance must be the ID. For example,

        samples = [[1, 0, 0, 0, 0], [2, 0, 1, 0, 1], [3, 0, 0, 1, 1], [4, 0, 1, 1, 2]]
        problem = svm_problem(labels, samples)
        param = svm_parameter(kernel_type=PRECOMPUTED)

What is an ID? There are no further details on that. Can I assign IDs sequentially?

Any libsvm help and an example of precomputed kernels would be really appreciated.


asked Mar 19 '10 01:03

Lyyli

2 Answers

scikit-learn hides most of the details of libsvm when handling custom kernels. You can either pass an arbitrary function as your kernel, in which case it computes the Gram matrix for you, or pass the precomputed Gram matrix of the kernel.

For the first one, the syntax is:

   >>> from sklearn import svm
   >>> clf = svm.SVC(kernel=my_kernel)

where my_kernel is your kernel function. You can then call clf.fit(X, y) and it will compute the kernel matrix for you. In the second case, the syntax is:

   >>> from sklearn import svm
   >>> clf = svm.SVC(kernel="precomputed")

And when you call clf.fit(X, y), X must be the matrix k(X, X), where k is your kernel. See also this example for more details:

http://scikit-learn.org/stable/auto_examples/svm/plot_custom_kernel.html
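To make the two options concrete, here is a minimal sketch on the Iris data, assuming a current scikit-learn where the package is imported as `sklearn`. The helper name `linear_kernel` and the even/odd train/test split are just illustrative choices of mine, not something from the docs. Note that at prediction time the precomputed matrix has to hold the kernel between the test points and the training points, i.e. k(X_test, X_train).

```python
import numpy as np
from sklearn import datasets, svm


def linear_kernel(A, B):
    """Gram matrix of the linear kernel: entry (i, j) is dot(A[i], B[j])."""
    return np.dot(A, B.T)


X, y = datasets.load_iris(return_X_y=True)
X_train, y_train = X[::2], y[::2]    # even rows for training (illustrative split)
X_test, y_test = X[1::2], y[1::2]    # odd rows for testing

# Option 1: pass the kernel function; scikit-learn builds the Gram matrix.
clf_callable = svm.SVC(kernel=linear_kernel).fit(X_train, y_train)
pred_callable = clf_callable.predict(X_test)

# Option 2: pass precomputed Gram matrices.
gram_train = linear_kernel(X_train, X_train)  # shape (n_train, n_train)
gram_test = linear_kernel(X_test, X_train)    # shape (n_test, n_train)
clf_precomputed = svm.SVC(kernel="precomputed").fit(gram_train, y_train)
pred_precomputed = clf_precomputed.predict(gram_test)
```

Both classifiers should give the same predictions, since passing a function just makes scikit-learn build the same Gram matrix internally.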

answered Oct 19 '22 03:10

2 revs

First of all, some background to kernels and SVMs...

If you want to pre-compute a kernel for n vectors (of any dimension), what you need to do is calculate the kernel function between each pair of examples. The kernel function takes two vectors and gives a scalar, so you can think of a precomputed kernel as an n x n matrix of scalars. It's usually called the kernel matrix, or sometimes the Gram matrix.

There are many different kernels; the simplest is the linear kernel (also known as the dot product):

sum(x_i * y_i) for i in [1..N], where (x_1,...,x_N) and (y_1,...,y_N) are vectors
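As a rough sketch of the idea in plain Python (no libraries needed): the function names `linear_kernel` and `gram_matrix` and the three sample vectors are my own illustrative choices. Any function that takes two vectors and returns a scalar could be swapped in for the linear kernel.

```python
def linear_kernel(x, y):
    """Dot product of two equal-length vectors."""
    return sum(x_i * y_i for x_i, y_i in zip(x, y))


def gram_matrix(vectors, kernel):
    """Return the n x n matrix whose (i, j) entry is kernel(vectors[i], vectors[j])."""
    return [[kernel(a, b) for b in vectors] for a in vectors]


# Three four-feature vectors (sample values for illustration).
vectors = [
    [1, 1, 1, 1],
    [0, 3, 0, 3],
    [0, 0, 1, 0],
]

K = gram_matrix(vectors, linear_kernel)
for row in K:
    print(row)
```

For n vectors this calls the kernel n * n times; since the matrix is symmetric, you could compute only one triangle and mirror it if the kernel is expensive.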

Secondly, trying to answer your problem...

The documentation about precomputed kernels in libsvm is actually pretty good...

Assume the original training data has three four-feature instances
and testing data has one instance:

15  1:1 2:1 3:1 4:1
45      2:3     4:3
25          3:1

15  1:1     3:1

If the linear kernel is used, we have the following 
new training/testing sets:

15  0:1 1:4 2:6  3:1
45  0:2 1:6 2:18 3:0 
25  0:3 1:1 2:0  3:1

15  0:? 1:2 2:0  3:1

Each vector in the second example is a row in the kernel matrix. The value at index zero is the ID value, and it just seems to be a sequential count. The value at index 1 of the first vector is the value of the kernel function of the first vector from the first example with itself (i.e. (1x1)+(1x1)+(1x1)+(1x1) = 4); the value at index 2 is the kernel function of the first vector with the second (i.e. (1x3)+(1x3) = 6). It follows on like that for the rest of the example. The last row is the test instance: its values are the kernel function between the test vector and each of the three training vectors, and its ID is left as ?. You can see that the kernel matrix is symmetric, as it should be, because K(x,y) = K(y,x).
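Here is a small sketch that rebuilds those rows from the original vectors, assuming the IDs are simply 1, 2, 3, ... in training order. The function names are hypothetical, and the `placeholder_id` of 0 stands in for the `?` in the test row, which, as far as I understand the libsvm README, can be any value.

```python
def linear_kernel(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def precomputed_train_rows(train, kernel):
    """Row i is [ID, K(x_i, x_1), ..., K(x_i, x_n)], with IDs counted from 1."""
    return [[i] + [kernel(x, z) for z in train]
            for i, x in enumerate(train, start=1)]


def precomputed_test_rows(test, train, kernel, placeholder_id=0):
    """Each test row holds kernel values against every training vector."""
    return [[placeholder_id] + [kernel(x, z) for z in train] for x in test]


# Dense versions of the vectors from the documentation example.
train = [[1, 1, 1, 1], [0, 3, 0, 3], [0, 0, 1, 0]]
test = [[1, 0, 1, 0]]

train_rows = precomputed_train_rows(train, linear_kernel)
test_rows = precomputed_test_rows(test, train, linear_kernel)
print(train_rows)
print(test_rows)
```

With libsvm's own Python interface, my understanding is that rows like these are passed as `svm_problem(labels, train_rows, isKernel=True)` together with `svm_parameter('-t 4')`, but the exact call depends on the libsvm version, so check the README that ships with yours.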

It's worth pointing out that the first set of vectors is represented in a sparse format (i.e. missing values are zero), but the kernel matrix isn't and shouldn't be sparse. I don't know why that is; it just seems to be a libsvm thing.
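If your data is in that sparse text format, you need dense vectors (or at least a kernel that treats missing entries as zero) before computing the kernel values. A minimal sketch of the conversion, where `parse_sparse_line` is a hypothetical helper and the number of features is assumed to be known up front:

```python
def parse_sparse_line(line, n_features):
    """Parse '<label> index:value ...' into (label, dense list).

    Indices that do not appear in the line are filled with 0.
    """
    label, *pairs = line.split()
    dense = [0.0] * n_features
    for pair in pairs:
        index, value = pair.split(":")
        dense[int(index) - 1] = float(value)  # feature indices start at 1
    return float(label), dense


label, vector = parse_sparse_line("45 2:3 4:3", n_features=4)
print(label, vector)
```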

answered Oct 19 '22 04:10

Stompchicken
