Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SVM with cross validation in R using caret

Tags:

r

svm

r-caret

I was told to use the caret package in order to perform Support Vector Machine regression with 10 fold cross validation on a data set I have. I'm plotting my response variable against 151 variables. I did the following:-

> ctrl <- trainControl(method = "repeatedcv", repeats = 10)
> set.seed(1500)
> mod <- train(RT..seconds.~., data=cadets, method = "svmLinear", trControl = ctrl)

in which I got

C    RMSE  Rsquared  RMSE SD  Rsquared SD
  0.2  50    0.8       20       0.1        
  0.5  60    0.7       20       0.2        
  1    60    0.7       20       0.2   

But I want to be able to have a look at my folds, and for each of them how close the predicted values were to the actual values. How do I go about looking at this?

Also, it says that:-

RMSE was used to select the optimal model using  the smallest value.
The final value used for the model was C = 0.

I was just wondering what this meant and what the C stands for in the table above?

RT (seconds)    76_TI2  114_DECC    120_Lop 212_PCD 236_X3Av
38  4.086   1.2 2.322   0   0.195
40  2.732   0.815   1.837   1.113   0.13
41  4.049   1.153   2.117   2.354   0.094
41  4.049   1.153   2.117   3.838   0.117
42  4.56    1.224   2.128   2.38    0.246
42  2.96    0.909   1.686   0.972   0.138
42  3.237   0.96    1.922   1.202   0.143
44  2.989   0.8 1.761   2.034   0.11
44  1.993   0.5 1.5 0   0.102
44  2.957   0.8 1.761   0.988   0.141
44  2.597   0.889   1.888   1.916   0.114
44  2.428   0.691   1.436   1.848   0.089

This is a snipet of my dataset. I'm trying to pot RT seconds against 151 variables.

Thanks

like image 779
user2062207 Avatar asked Dec 09 '13 01:12

user2062207


People also ask

Can you compare LR and SVM?

LR vs SVM : SVM supports both linear and non-linear solutions using kernel trick. SVM handles outliers better than LR. Both perform well when the training data is less, and there are large number of features.

What is the caret package in R?

Caret is a one-stop solution for machine learning in R. The R package caret has a powerful train function that allows you to fit over 230 different models using one syntax. There are over 230 models included in the package including various tree-based models, neural nets, deep learning and much more.

Can SVMS be used for regression?

Overview. Support vector machine (SVM) analysis is a popular machine learning tool for classification and regression, first identified by Vladimir Vapnik and his colleagues in 1992[5]. SVM regression is considered a nonparametric technique because it relies on kernel functions.

Can SVM be used for feature selection?

Thus, the SVM model with feature selection becomes a mixed-integer model and therefore a non-convex model. Since the SVM problem without feature selection is a linear programming problem and thus a convex problem, its Pareto frontier can be obtained by varying the parameter .


1 Answers

You have to save your CV predictions via the "savePred" option in your trainControl object. I'm not sure what package your "cadets" data is from, but here is a trivial example using iris:

> library(caret)
> ctrl <- trainControl(method = "cv", savePred=T, classProb=T)
> mod <- train(Species~., data=iris, method = "svmLinear", trControl = ctrl)
> head(mod$pred)
        pred        obs      setosa  versicolor   virginica rowIndex   .C Resample
1     setosa     setosa 0.982533940 0.009013592 0.008452468       11 0.25   Fold01
2     setosa     setosa 0.955755054 0.032289120 0.011955826       35 0.25   Fold01
3     setosa     setosa 0.941292675 0.044903583 0.013803742       46 0.25   Fold01
4     setosa     setosa 0.983559919 0.008310323 0.008129757       49 0.25   Fold01
5     setosa     setosa 0.972285699 0.018109218 0.009605083       50 0.25   Fold01
6 versicolor versicolor 0.007223973 0.971168170 0.021607858       59 0.25   Fold01

EDIT: The "C" is one of tuning parameters for your SVM. Check out the help for the ksvm function in the kernlab package for more details.

EDIT2: Trivial regression example

> library(caret)
> ctrl <- trainControl(method = "cv", savePred=T)
> mod <- train(Sepal.Length~., data=iris, method = "svmLinear", trControl = ctrl)
> head(mod$pred)
      pred obs rowIndex   .C Resample
1 4.756119 4.8       13 0.25   Fold01
2 4.910948 4.8       31 0.25   Fold01
3 5.094275 4.9       38 0.25   Fold01
4 4.728503 4.8       46 0.25   Fold01
5 5.192965 5.3       49 0.25   Fold01
6 5.969479 5.9       62 0.25   Fold01
like image 172
David Avatar answered Sep 17 '22 01:09

David