How to perform 10 fold cross validation with LibSVM in R?

I know that in MATLAB this is really easy ('-v 10').

But I need to do it in R. I found a comment suggesting that adding cross = 10 as a parameter would do it, but I could not confirm this in the help file, so I am sceptical about it.

svm(Outcome ~ ., data = source, cost = 100, gamma = 1, cross = 10)

Any examples of a successful SVM script for R would also be appreciated, as I am still running into some dead ends.

Edit: I forgot to mention outside of the tags that I use the libsvm package for this.

Sigvard asked Nov 12 '12 16:11


2 Answers

I am also trying to perform 10-fold cross-validation. I don't think tune is the right tool for it, since that function is meant for optimizing the parameters rather than for training and testing the model.
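On the cross = 10 argument from the question: as far as I can tell, the svm() function in the e1071 package (the usual R interface to libsvm) does document a cross argument that runs a k-fold cross-validation on the training data. Here is a minimal sketch, assuming e1071 is installed and using the built-in iris data as a stand-in for the asker's data, with Species in place of Outcome:

```r
library(e1071)  # assumption: the e1071 package is installed

set.seed(1)  # the fold assignment is random, so fix the seed for repeatability

# iris is only a stand-in data set; cost and gamma are copied from the question
model <- svm(Species ~ ., data = iris, cost = 100, gamma = 1, cross = 10)

model$accuracies    # one accuracy value (in percent) per fold
model$tot.accuracy  # overall cross-validated accuracy (in percent)
```

This only gives accuracy figures for a classification model. If you need the per-fold predictions or confusion tables, you still have to build the folds yourself.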

I have the following code to perform leave-one-out (LOO) cross-validation. Suppose that dataset is a data.frame holding your data, with the class labels in a factor column named classes. In each LOO step, the observed vs. predicted table for the held-out row is added to result, so that at the end result contains the global observed vs. predicted matrix.

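# LOO validation (svm and classAgreement come from the e1071 package)
library(e1071)
result <- 0  # turns into the confusion table after the first addition
for (i in 1:nrow(dataset)) {  # loop over rows; length(dataset) would count columns
    fit <- svm(classes ~ ., data = dataset[-i, ], type = 'C-classification', kernel = 'linear')
    pred <- predict(fit, dataset[i, ])
    result <- result + table(true = dataset[i, ]$classes, pred = pred)
}
classAgreement(result)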

So in order to perform a 10-fold cross validation, I guess we should manually partition the dataset, and use the folds to train and test the model.

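# getFoldTrainSet() and getFoldTestSet() are placeholders for your own partitioning code
results <- list()
for (i in 1:10) {
    train <- getFoldTrainSet(dataset, i)
    test <- getFoldTestSet(dataset, i)
    fit <- svm(classes ~ ., data = train, type = 'C-classification', kernel = 'linear')
    pred <- predict(fit, test)
    # keep one confusion table per fold; c() on tables would flatten them into a vector
    results[[i]] <- table(true = test$classes, pred = pred)
}
# compute mean accuracies and kappas using results, which stores the confusion table of each fold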
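
The loop above can be sketched end to end as follows. This is only an illustration under some assumptions: e1071 is installed, iris stands in for dataset (with Species renamed to classes), and the two fold helpers are replaced by a random fold label per row (fold_id is a name I made up for this example).

```r
library(e1071)  # assumption: the e1071 package is installed

set.seed(42)
dataset <- iris  # stand-in data set
names(dataset)[names(dataset) == "Species"] <- "classes"  # match the column name used above

# give every row a random fold label from 1 to 10 (roughly equal fold sizes)
fold_id <- sample(rep(1:10, length.out = nrow(dataset)))

accuracies <- numeric(10)
for (i in 1:10) {
    train <- dataset[fold_id != i, ]  # nine folds for training
    test  <- dataset[fold_id == i, ]  # one fold held out for testing
    fit   <- svm(classes ~ ., data = train, type = "C-classification", kernel = "linear")
    pred  <- predict(fit, test)
    # fraction of correct predictions in fold i
    accuracies[i] <- mean(as.character(pred) == as.character(test$classes))
}
mean(accuracies)  # average accuracy over the 10 folds
```

Storing one number per fold keeps the averaging trivial. If you also want kappa, store the per-fold confusion tables instead and pass each one to classAgreement().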

I hope this helps.

hlfernandez answered Nov 14 '22 02:11


Here is a simple way to create 10 test and training folds using no packages:

# Randomly shuffle the data
yourData <- yourData[sample(nrow(yourData)), ]

# Create 10 equally sized folds
folds <- cut(seq(1, nrow(yourData)), breaks = 10, labels = FALSE)

# Perform 10-fold cross-validation
for (i in 1:10) {
    # Segment your data by fold using the which() function
    testIndexes <- which(folds == i, arr.ind = TRUE)
    testData <- yourData[testIndexes, ]
    trainData <- yourData[-testIndexes, ]
    # Use the test and train data however you desire...
}
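
If you want to convince yourself that the split behaves as intended, a quick check along these lines may help. It is only a sketch: iris stands in for yourData, and timesTested and trainIndexes are names I introduced for the check. It uses base R only.

```r
set.seed(123)
yourData <- iris[sample(nrow(iris)), ]  # iris is only a stand-in data set
folds <- cut(seq(1, nrow(yourData)), breaks = 10, labels = FALSE)

table(folds)  # number of rows in each fold

timesTested <- integer(nrow(yourData))  # how often each row lands in a test set
for (i in 1:10) {
    testIndexes <- which(folds == i)
    trainIndexes <- setdiff(seq_len(nrow(yourData)), testIndexes)
    # train and test rows must never overlap within a fold
    stopifnot(length(intersect(testIndexes, trainIndexes)) == 0)
    timesTested[testIndexes] <- timesTested[testIndexes] + 1
}
all(timesTested == 1)  # TRUE when every row is tested exactly once
```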

Jake Drew answered Nov 14 '22 02:11