Accuracy of LibSVM decreases





After getting my TestLabel and TrainLabel, I trained an SVM with LibSVM and got an accuracy of 97.4359% (c = 1, g = 0.00375):

model = svmtrain(TrainLabel, TrainVec, '-c 1 -g 0.00375');
[predict_label, accuracy, dec_values] = svmpredict(TestLabel, TestVec, model);

Then I searched for the best c and g with 5-fold cross-validation:

bestcv = 0;
for log2c = -1:3,
  for log2g = -4:1,
    cmd = ['-v 5 -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
    cv = svmtrain(TrainLabel, TrainVec, cmd);
    if (cv >= bestcv),
      bestcv = cv; bestc = 2^log2c; bestg = 2^log2g;
    end
    fprintf('%g %g %g (best c=%g, g=%g, rate=%g)\n', log2c, log2g, cv, bestc, bestg, bestcv);
  end
end

The search returned c = 8 and g = 0.125.

I trained the model again with these values:

model = svmtrain(TrainLabel, TrainVec, '-c 8 -g 0.125');
[predict_label, accuracy, dec_values] = svmpredict(TestLabel, TestVec, model);

Now I get an accuracy of 82.0513%.

How is it possible for the accuracy to decrease? Shouldn't it increase? Or am I making a mistake?

lakshmen asked Jan 20 '12 17:01


1 Answer

The accuracies you got during parameter tuning are biased upwards, because the data used to score each (c, g) pair were also the data used to train the models and pick the winner. This is usually fine for parameter tuning itself.

However, if you want those accuracies to be accurate estimates of the true generalization error on your final test set, you have to add an outer layer of cross-validation (or another resampling scheme) around the tuning step.

Here is a very clear paper that outlines the general issue (in the similar context of feature selection): http://www.pnas.org/content/99/10/6562.abstract
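To make the upward bias concrete, here is a toy simulation in plain Python (chosen only so it runs without MATLAB or LibSVM). It is a sketch under strong assumptions, not a model of the asker's data: every "configuration" is a coin-flip classifier that is right half the time, and the function name `selection_bias_demo` and its parameters are made up for illustration. Picking the best-scoring configuration still reports a tuning accuracy well above 50%, while the same configuration on fresh samples stays near 50%.

```python
import random


def selection_bias_demo(n_samples=40, n_configs=30, n_trials=200, seed=0):
    """Return (mean tuning accuracy of the selected configuration,
    mean accuracy of a configuration on fresh samples).

    Every configuration is a coin-flip classifier, so its true accuracy is 0.5.
    """
    rng = random.Random(seed)
    tuned_total = 0.0
    fresh_total = 0.0
    for _ in range(n_trials):
        # Score each configuration on the tuning data: each prediction is
        # correct with probability 0.5, independent of the configuration.
        tuning_scores = [
            sum(rng.random() < 0.5 for _ in range(n_samples)) / n_samples
            for _ in range(n_configs)
        ]
        # "Tuning" keeps the highest score, which is the number that gets reported.
        tuned_total += max(tuning_scores)
        # The selected configuration is still a coin flip on new samples.
        fresh_total += sum(rng.random() < 0.5 for _ in range(n_samples)) / n_samples
    return tuned_total / n_trials, fresh_total / n_trials


tuned, fresh = selection_bias_demo()
print("mean accuracy reported by tuning:", tuned)
print("mean accuracy on fresh samples:  ", fresh)
```

The gap grows with the number of configurations tried and shrinks with the number of samples, which is why a best-of-grid score on a small data set should not be read as a test-set estimate.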


I usually add cross-validation like this:

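n     = 95;  % total number of observations
nfold = 10;  % desired number of folds

% Set up CV folds: repeat 1:nfold enough times to cover n,
% then keep the first n labels in random order
inds = repmat(1:nfold, 1, ceil(n / nfold));
inds = inds(randperm(n));

% Loop over folds (data is assumed to be an n-row matrix)
for i = 1:nfold
  datapart = data(inds ~= i, :);  % training rows: everything except fold i
  testpart = data(inds == i, :);  % held-out rows: fold i

  % do some stuff

  % save results
end

% combine results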
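For readers who want to see the "do some stuff" step filled in, the following is one possible sketch of nested cross-validation, written in plain Python so it runs without MATLAB or LibSVM. Everything in it is an assumption for illustration: a 1-D k-nearest-neighbour vote stands in for the SVM, its `k` stands in for (c, g), and the helper names (`make_folds`, `split`, `cv_accuracy`, `nested_cv`) are hypothetical. The point it shows is that tuning happens only inside each outer training part, so the outer held-out fold never influences the parameter choice.

```python
import random


def make_folds(n, nfold, rng):
    """Give each of n samples a fold id in 0..nfold-1; sizes differ by at most one."""
    ids = [i % nfold for i in range(n)]
    rng.shuffle(ids)
    return ids


def predict(train_x, train_y, x, k):
    """Toy stand-in for the SVM: majority vote of the k nearest 1-D training points."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    votes = [train_y[i] for i in order[:k]]
    return 1 if 2 * sum(votes) > len(votes) else 0


def accuracy(train_x, train_y, test_x, test_y, k):
    """Fraction of test points classified correctly."""
    hits = sum(predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y))
    return hits / len(test_x)


def split(xs, ys, ids, fold):
    """Return (train_x, train_y, test_x, test_y) with `fold` held out."""
    train = [i for i, f in enumerate(ids) if f != fold]
    test = [i for i, f in enumerate(ids) if f == fold]
    return ([xs[i] for i in train], [ys[i] for i in train],
            [xs[i] for i in test], [ys[i] for i in test])


def cv_accuracy(xs, ys, ids, nfold, k):
    """Mean held-out accuracy over the folds described by ids."""
    return sum(accuracy(*split(xs, ys, ids, f), k) for f in range(nfold)) / nfold


def nested_cv(xs, ys, grid, outer=5, inner=5, seed=0):
    """Estimate generalization accuracy with tuning repeated inside every outer fold."""
    rng = random.Random(seed)
    outer_ids = make_folds(len(xs), outer, rng)
    scores = []
    for fold in range(outer):
        tr_x, tr_y, te_x, te_y = split(xs, ys, outer_ids, fold)
        # Inner loop: choose the parameter using the outer-training part only.
        inner_ids = make_folds(len(tr_x), inner, rng)
        best_k = max(grid, key=lambda k: cv_accuracy(tr_x, tr_y, inner_ids, inner, k))
        # Outer loop: score the tuned model on data the tuning never saw.
        scores.append(accuracy(tr_x, tr_y, te_x, te_y, best_k))
    return sum(scores) / outer
```

Calling `nested_cv(xs, ys, grid=[1, 3, 5])` on two lists of equal length (1-D features and 0/1 labels) returns one averaged accuracy. With LibSVM, the inner step would be the `-v 5` grid search and the outer step a plain `svmtrain` followed by `svmpredict` on the held-out fold.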
John Colby answered Sep 23 '22 13:09