I am using libsvm with precomputed kernels. I generated a precomputed kernel file for the example data set heart_scale and ran svmtrain(). It worked properly and the support vectors were identified correctly, i.e. the same as with standard kernels.
However, when I try to run svmpredict(), it gives different results for the precomputed model file. After digging through the code, I noticed that the svm_predict_values() function requires the actual features of the support vectors, which are unavailable in precomputed mode. In precomputed mode, we only have the coefficient and index of each support vector, which svmpredict() mistakes for its features.
Is this an issue, or am I missing something?
(Please let me know how to run svmpredict() in precomputed mode.)
The values of the kernel evaluation between a test set vector, x, and each training set vector should be used as the test set feature vector.
Here are the pertinent lines from the libsvm readme:
New training instance for xi:
<label> 0:i 1:K(xi,x1) ... L:K(xi,xL)
New testing instance for any x:
<label> 0:? 1:K(x,x1) ... L:K(x,xL)
The libsvm readme is saying that if you have L training set vectors, where xi is a training set vector with i from [1..L], and a test set vector, x, then the feature vector for x should be
<label of x> 0:<any number> 1:K(x^{test},x1^{train}) 2:K(x^{test},x2^{train}) ... L:K(x^{test},xL^{train})
where K(u,v) denotes the output of the kernel function with vectors u and v as the arguments.
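To make the format concrete, here is a minimal sketch (plain NumPy, no libsvm dependency, toy data of my own choosing) that builds the precomputed training rows for a three-vector training set with a linear kernel:

```python
import numpy as np

# toy training set: three 2-D vectors
X_train = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [1.0, 1.0]])

# linear kernel: K(u, v) = u . v, so the full matrix is X X^T
K = X_train @ X_train.T  # shape (3, 3)

# precomputed training rows: column 0 holds the 1-based serial
# number i, columns 1..L hold K(xi, x1) ... K(xi, xL)
rows = np.hstack([np.arange(1, 4)[:, None], K])
print(rows)
```

Each row of `rows` is exactly one `<label> 0:i 1:K(xi,x1) ... L:K(xi,xL)` line (minus the label). A test row is built the same way, except the kernel is evaluated between the test vector and every training vector.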
I have included some example python code below.
The results from the original feature vector representation and the precomputed (linear) kernel are not exactly the same, but this is probably due to differences in the optimization algorithm.
from svmutil import *
import numpy as np

# original example: train on the first 200 instances, test on the rest
y, x = svm_read_problem('.../heart_scale')
m = svm_train(y[:200], x[:200], '-c 4')
p_label, p_acc, p_val = svm_predict(y[200:], x[200:], m)

##############
# train the SVM using a precomputed linear kernel

# convert the sparse dict representation to a dense array
max_key = max(max(v.keys()) for v in x)
arr = np.zeros((len(x), max_key))
for row, vec in enumerate(x):
    for k, v in vec.items():  # vec.iteritems() in Python 2
        arr[row][k - 1] = v
x = arr

# linear kernel matrix for the training data;
# column 0 holds the 1-based serial number of each instance
K_train = np.zeros((200, 201))
K_train[:, 1:] = np.dot(x[:200], x[:200].T)
K_train[:, :1] = np.arange(200)[:, np.newaxis] + 1
m = svm_train(y[:200], [list(row) for row in K_train], '-c 4 -t 4')

# kernel matrix between the test data and the training data
K_test = np.zeros((len(x) - 200, 201))
K_test[:, 1:] = np.dot(x[200:], x[:200].T)
K_test[:, :1] = np.arange(len(x) - 200)[:, np.newaxis] + 1
p_label, p_acc, p_val = svm_predict(y[200:], [list(row) for row in K_test], m)
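The same construction works for any kernel, not just linear: only the matrix entries change. As a sketch (synthetic data and an arbitrary `gamma`, both my own choices), here is how the precomputed matrices would look for an RBF kernel:

```python
import numpy as np

def rbf_kernel_matrix(A, B, gamma=0.5):
    """K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * sq)

rng = np.random.default_rng(0)
X_train = rng.normal(size=(5, 3))
X_test = rng.normal(size=(2, 3))

# training rows: serial number in column 0, K(xi, xj) in columns 1..5
K_train = np.zeros((5, 6))
K_train[:, 1:] = rbf_kernel_matrix(X_train, X_train)
K_train[:, 0] = np.arange(1, 6)

# test rows: kernel between each test vector and every training vector
K_test = np.zeros((2, 6))
K_test[:, 1:] = rbf_kernel_matrix(X_test, X_train)
K_test[:, 0] = np.arange(1, 3)
```

These matrices would be passed to svm_train/svm_predict with `-t 4` exactly as in the linear example above.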