
bad result when using precomputed chi2 kernel with libsvm (matlab)

I am trying out libsvm, following the example for training an SVM on the heart_scale data that comes with the software. I want to use a chi-squared kernel, which I precompute myself. The classification rate on the training data drops to 24%. I am sure I compute the kernel correctly, but I guess I must be doing something wrong. The code is below. Can you see any mistakes? Help would be greatly appreciated.

%read in the data:
[heart_scale_label, heart_scale_inst] = libsvmread('heart_scale');
train_data = heart_scale_inst(1:150,:);
train_label = heart_scale_label(1:150,:);
test_data = heart_scale_inst(151:270,:);   %remaining rows held out (heart_scale has 270 instances)

%read somewhere that the kernel should not be sparse
ttrain = full(train_data)';
ttest = full(test_data)';

precKernel = chi2_custom(ttrain', ttrain');
%svmtrain2: libsvm's svmtrain, presumably renamed to avoid clashing with the Statistics Toolbox function
model_precomputed = svmtrain2(train_label, [(1:150)', precKernel], '-t 4');

This is how the kernel is precomputed:

function res = chi2_custom(x,y)
a = size(x);
b = size(y);
res = zeros(a(1,1), b(1,1));
%fill the kernel matrix one entry at a time
for i=1:a(1,1)
    for j=1:b(1,1)
        resHelper = chi2_ireneHelper(x(i,:), y(j,:));
        res(i,j) = resHelper;
    end
end

function resHelper = chi2_ireneHelper(x,y)
%chi-squared distance between two row vectors
a = (x-y).^2;
b = (x+y);
resHelper = sum(a./(b + eps));

With a different SVM implementation (vlfeat) I obtain a classification rate on the training data (yes, I tested on the training data, just to see what is going on) of around 90%. So I am pretty sure the libsvm result is wrong.

asked Aug 24 '11 by Sallos

1 Answer

When working with support vector machines, it is very important to normalize the dataset as a pre-processing step. Normalization puts the attributes on the same scale and prevents attributes with large values from biasing the result. It also improves numerical stability (minimizes the likelihood of overflows and underflows due to floating-point representation).
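
For instance, a minimal min-max scaling sketch (variable names are illustrative; the full example below normalizes the whole dataset at once before splitting, while a stricter setup computes the statistics on the training rows only and reuses them for the test rows, as sketched here):

%# min-max scale each attribute (column) to the [0,1] range
mn = min(trainData,[],1); mx = max(trainData,[],1);
trainData = bsxfun(@rdivide, bsxfun(@minus, trainData, mn), mx-mn);
testData  = bsxfun(@rdivide, bsxfun(@minus, testData, mn), mx-mn);   %# reuse train statistics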

Also, to be exact, your calculation of the chi-squared kernel is slightly off: it omits the factor of 1/2 in the denominator, and it never converts the resulting distance into a similarity (the final 1 - ... step). Instead, take the definition below and use this faster, vectorized implementation of it:

k(x,y) = 1 - sum_i (x_i - y_i)^2 / ((x_i + y_i)/2)

function D = chi2Kernel(X,Y)
    %# pairwise chi-squared kernel between the rows of X and the rows of Y
    D = zeros(size(X,1),size(Y,1));
    for i=1:size(Y,1)
        d = bsxfun(@minus, X, Y(i,:));        %# differences against i-th row of Y
        s = bsxfun(@plus, X, Y(i,:));         %# sums against i-th row of Y
        D(:,i) = sum(d.^2 ./ (s/2+eps), 2);   %# chi-squared distances
    end
    D = 1 - D;                                %# turn distances into similarities
end
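
As a quick sanity check (my own snippet, not part of the original answer): for data in [0,1], the kernel of a matrix with itself should be symmetric with ones on the diagonal, since the chi-squared distance of a row to itself is zero:

%# sanity check on random data in [0,1]
X = rand(5,3);
K = chi2Kernel(X,X);
disp(max(max(abs(K - K'))))   %# ~0: kernel matrix is symmetric
disp(diag(K)')                %# all ones: k(x,x) = 1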

Now consider the following example using the same dataset as you (code adapted from a previous answer of mine):

%# read dataset
[label,data] = libsvmread('./heart_scale');
data = full(data);      %# sparse to full

%# normalize data to [0,1] range
mn = min(data,[],1); mx = max(data,[],1);
data = bsxfun(@rdivide, bsxfun(@minus, data, mn), mx-mn);

%# split into train/test datasets
trainData = data(1:150,:);    testData = data(151:270,:);
trainLabel = label(1:150,:);  testLabel = label(151:270,:);
numTrain = size(trainData,1); numTest = size(testData,1);

%# compute kernel matrices between every pairs of (train,train) and
%# (test,train) instances and include sample serial number as first column
K =  [ (1:numTrain)' , chi2Kernel(trainData,trainData) ];
KK = [ (1:numTest)'  , chi2Kernel(testData,trainData)  ];

%# view 'train vs. train' kernel matrix
figure, imagesc(K(:,2:end))
colormap(pink), colorbar

%# train model
model = svmtrain(trainLabel, K, '-t 4');   %# -t 4: precomputed kernel

%# test on testing data
[predTestLabel, acc, decVals] = svmpredict(testLabel, KK, model);
cmTest = confusionmat(testLabel,predTestLabel)

%# test on training data
[predTrainLabel, acc, decVals] = svmpredict(trainLabel, K, model);
cmTrain = confusionmat(trainLabel,predTrainLabel)

The result on the testing data:

Accuracy = 84.1667% (101/120) (classification)
cmTest =
    62     8
    11    39

and on the training data, we get around 90% accuracy as you expected:

Accuracy = 92.6667% (139/150) (classification)
cmTrain =
    77     3
     8    62

[figure: 'train vs. train' kernel matrix, as plotted with imagesc above]

answered Sep 21 '22 by Amro