
Vastly different results for SVM model using e1071 and caret

I'm training two SVM models on my data using two different packages and getting vastly different results. Is this to be expected?

model1 using e1071

library('e1071')
model1 <- svm(myFormula, data=trainset,type='C',kernel='linear',probability = TRUE)
outTrain <- predict(model1, trainset, probability = TRUE)
outTest <- predict(model1, testset, probability = TRUE)
train_pred <- attr(outTrain, "probabilities")[,2]
test_pred <- attr(outTest, "probabilities")[,2]
calculateAUC(train_pred,trainTarget)
calculateAUC(test_pred,testTarget)

model2 using caret

library('caret')
model2 <- train(myFormula,data=trainset,method='svmLinear')
train_pred <- predict(model2, trainset)
test_pred  <- predict(model2, testset)
calculateAUC(train_pred,trainTarget)
calculateAUC(test_pred,testTarget)

calculateAUC() is a function I defined to calculate the AUC value, given the predicted and the actual values of the target. The values I get (train, then test) are:

model1 (e1071)

1
0.8567979

model2 (caret)

0.9910193
0.758201

Is this something that is possible? Or am I doing this wrong?

I can provide sample data if that would be helpful.

asked Sep 20 '13 07:09 by user2175594



3 Answers

Yes, it is possible. Likely causes include:

  • Different C values: the default in e1071 is 1; caret may use a different value.
  • Data scaling: e1071 scales your input by default, while caret does not preprocess by default (although kernlab's svm, the model caret uses "under the hood", does scale, so you would need to check the source to be sure).
  • Different eps/max-iteration settings or other optimization-related thresholds.

Simply display your models' parameters after training and check whether they are the same; you will probably find some parameter whose default differs between the two libraries.
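As a rough sketch of that check, the snippet below fits a linear SVM with each library on a two-class subset of the built-in iris data (an assumption standing in for the asker's data, which is not shown) and prints the settings each fitted model actually used. It calls kernlab directly, since that is the library behind caret's svmLinear; for a caret model the same accessors should apply to model2$finalModel.

```r
library(e1071)
library(kernlab)

# Toy two-class data; a stand-in for the asker's trainset.
d <- droplevels(iris[iris$Species != "setosa", ])

# Fit both models with library defaults, changing only the kernel.
m_e1071   <- svm(Species ~ ., data = d, type = "C-classification", kernel = "linear")
m_kernlab <- ksvm(Species ~ ., data = d, type = "C-svc", kernel = "vanilladot")

# e1071 stores its settings as list components of the fitted object.
print(m_e1071$cost)          # regularisation constant that was used
print(all(m_e1071$scaled))   # were all input columns scaled?

# kernlab exposes its settings through accessor functions.
print(param(m_kernlab)$C)    # regularisation constant that was used
str(scaling(m_kernlab))      # scaling information stored by kernlab, if any
```

Comparing these values side by side is usually quicker than reading the defaults out of each package's documentation.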

answered Oct 17 '22 05:10 by lejlot



I have observed that kernlab defines the RBF kernel as

rbf(x,y) = exp(-sigma * euclideanNorm(x-y)^2)

but according to this wiki link, the rbf kernel should be

rbf(x,y) = exp(-euclideanNorm(x-y)^2/(2*sigma^2))

which is also more intuitive, since two close samples with a large sigma value will then get a higher similarity.

I am not sure what e1071's svm uses (the native libsvm code?).

I know this is an old thread, but I hope someone can enlighten me on why there is a difference. A small example for comparison:

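library(kernlab)  # provides rbfdot()

set.seed(123)
x <- rnorm(3)
y <- rnorm(3)
sigma <- 100

rbf <- rbfdot(sigma=sigma)
rbf(x, y)                           # kernlab's form: exp(-sigma * ||x-y||^2)
exp( -sum((x-y)^2)/(2*sigma^2) )    # form from the wiki link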

I would expect the kernel value to be close to 1 (since x and y are drawn with standard deviation 1, while the kernel uses sigma=100). This is observed only in the second case.
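One way to reconcile the two formulas: they describe the same kernel family under different parametrizations. kernlab documents its sigma as an inverse kernel width, so it plays the role of 1/(2*s^2), where s is the bandwidth in the wiki form. libsvm, which e1071 wraps, uses the same exp(-gamma * ||x-y||^2) form with gamma in place of sigma. A minimal base-R sketch, in which both helper functions are hypothetical names written only for illustration:

```r
# Inverse-width form (the convention kernlab's rbfdot and libsvm's gamma follow).
rbf_inverse_width <- function(x, y, sigma) exp(-sigma * sum((x - y)^2))

# Bandwidth form (the convention in the wiki formula above).
rbf_bandwidth <- function(x, y, s) exp(-sum((x - y)^2) / (2 * s^2))

set.seed(123)
x <- rnorm(3)
y <- rnorm(3)

s <- 100                      # bandwidth in the wiki form
sigma_equiv <- 1 / (2 * s^2)  # matching inverse-width parameter

k_bandwidth     <- rbf_bandwidth(x, y, s)
k_inverse_width <- rbf_inverse_width(x, y, sigma_equiv)

# The two parametrizations agree once the parameter is converted.
print(all.equal(k_bandwidth, k_inverse_width))
```

So passing sigma = 100 to rbfdot corresponds to a very narrow kernel, not a wide one, which is why the first value in the example above is far from 1.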

answered Oct 17 '22 05:10 by jMathew



First, note that svmLinear relies on kernlab. You can use e1071 directly from caret by replacing the method argument svmLinear with svmLinear2 (see the detailed list of models and the libraries they depend on in the docs).
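A small sketch of that switch, assuming caret, kernlab and e1071 are all installed. The two-class iris subset is toy data chosen for illustration, and resampling is turned off so that each call fits exactly one model with the tuning value given:

```r
library(caret)

# Toy two-class data; an assumption standing in for the asker's trainset.
d <- droplevels(iris[iris$Species != "setosa", ])

# method = "none" skips resampling, so tuneGrid must contain a single row.
ctrl <- trainControl(method = "none")

# svmLinear is backed by kernlab; its tuning parameter is named C.
fit_kernlab <- train(Species ~ ., data = d, method = "svmLinear",
                     trControl = ctrl, tuneGrid = data.frame(C = 1))

# svmLinear2 is backed by e1071; its tuning parameter is named cost.
fit_e1071 <- train(Species ~ ., data = d, method = "svmLinear2",
                   trControl = ctrl, tuneGrid = data.frame(cost = 1))

# The class of the underlying fitted object shows which library did the work.
print(class(fit_kernlab$finalModel))
print(class(fit_e1071$finalModel))
```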

Now, note that the two libraries produce identical results, provided you pass them the right parameters. I recently benchmarked these methods and noted that passing the following parameters ensures the same results:

model_kernlab <- kernlab::ksvm(
  x = X,
  y = Y,
  scaled = TRUE,
  C = 5,
  kernel = "rbfdot",
  kpar = list(sigma = 1),
  type = "eps-svr",
  epsilon = 0.1
)

model_e1071 <- e1071::svm(
  x = X,
  y = Y,
  cost = 5,
  scale = TRUE,
  kernel = "radial",
  gamma = 1,
  type = "eps-regression",
  epsilon = 0.1
)

Note the different names (kernlab / e1071):

  • C / cost
  • sigma / gamma
  • eps-svr / eps-regression
  • rbfdot / radial
  • ...
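To try this comparison yourself, here is a sketch on synthetic regression data. X and Y below are made up for illustration and are not the data used in the benchmark; how small the gap between the two sets of predictions is on your own data is something to check rather than assume.

```r
library(kernlab)
library(e1071)

# Synthetic regression data (an assumption, for illustration only).
set.seed(1)
X <- matrix(rnorm(200), ncol = 2)
Y <- sin(X[, 1]) + 0.1 * rnorm(100)

# Same settings as above, expressed in each library's own argument names.
model_kernlab <- ksvm(x = X, y = Y, scaled = TRUE, C = 5,
                      kernel = "rbfdot", kpar = list(sigma = 1),
                      type = "eps-svr", epsilon = 0.1)

model_e1071 <- svm(x = X, y = Y, cost = 5, scale = TRUE,
                   kernel = "radial", gamma = 1,
                   type = "eps-regression", epsilon = 0.1)

# Predict on the training inputs and measure the largest disagreement.
pred_kernlab <- as.numeric(predict(model_kernlab, X))
pred_e1071   <- as.numeric(predict(model_e1071, X))
max_diff <- max(abs(pred_kernlab - pred_e1071))
print(max_diff)
```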

answered Oct 17 '22 05:10 by RUser4512
