
How to obtain the training error in svm of Scikit-learn?

My question: How do I obtain the training error in the svm module (SVC class)?

I am trying to plot the error on the training set and the test set against the number of training samples used (or against hyperparameters such as C or gamma). However, according to the SVM documentation, there is no exposed attribute or method that returns this. I did find that RandomForestClassifier exposes an oob_score_, though.

asked Jul 30 '13 17:07 by log0




1 Answer

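Just compute the score on the training data. For a classifier such as SVC, score returns the mean accuracy, so the training error is 1 minus this value: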

>>> model.fit(X_train, y_train).score(X_train, y_train)
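To get the curves described in the question, the same call can be repeated for growing training subsets. The following is only a minimal sketch: the synthetic dataset from make_classification, the subset sizes, and the values C=1.0 and gamma="scale" are assumptions chosen for illustration, not something from the question.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic data stands in for the real dataset (assumption for illustration).
X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

train_sizes = [50, 100, 200, 300, 450]  # hypothetical subset sizes
train_errors, test_errors = [], []

for n in train_sizes:
    # Fit on the first n training samples only.
    model = SVC(C=1.0, gamma="scale").fit(X_train[:n], y_train[:n])
    # score() is mean accuracy, so error = 1 - score.
    train_errors.append(1.0 - model.score(X_train[:n], y_train[:n]))
    test_errors.append(1.0 - model.score(X_test, y_test))

for n, tr, te in zip(train_sizes, train_errors, test_errors):
    print(f"n={n:4d}  train error={tr:.3f}  test error={te:.3f}")
```

The two lists can then be passed to any plotting library. Looping over values of C or gamma instead of train_sizes gives the other curves mentioned in the question.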

You can also use any other performance metric from the sklearn.metrics module. The documentation is here:

http://scikit-learn.org/stable/modules/model_evaluation.html
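As a rough illustration of using sklearn.metrics for the training error, the sketch below applies zero_one_loss and f1_score to predictions on the training set. The synthetic data and the choice of these two metrics are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score, zero_one_loss
from sklearn.svm import SVC

# Synthetic binary classification data (assumption for illustration).
X_train, y_train = make_classification(
    n_samples=300, n_features=20, random_state=0
)

model = SVC().fit(X_train, y_train)
y_pred = model.predict(X_train)  # predictions on the training data itself

# Fraction of misclassified training samples, i.e. 1 - accuracy.
train_error = zero_one_loss(y_train, y_pred)
# Any other metric works the same way, e.g. the F1 score.
train_f1 = f1_score(y_train, y_pred)

print(f"training error (zero-one loss): {train_error:.3f}")
print(f"training F1 score: {train_f1:.3f}")
```

With its default settings, zero_one_loss returns the misclassification rate, so it should agree with 1 - model.score(X_train, y_train).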

Also: oob_score_ is an estimate of the test / validation score, not the training score, because each sample is scored only by the trees that did not see it during training.
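The difference can be seen by comparing the two numbers on the same forest. This is a small sketch on synthetic data (an assumption); the exact values depend on the dataset and the random seed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (assumption for illustration).
X_train, y_train = make_classification(
    n_samples=500, n_features=20, random_state=0
)

# oob_score=True asks the forest to compute the out-of-bag estimate.
forest = RandomForestClassifier(
    n_estimators=100, oob_score=True, random_state=0
).fit(X_train, y_train)

oob_accuracy = forest.oob_score_                   # estimate of held-out accuracy
train_accuracy = forest.score(X_train, y_train)    # accuracy on the training data

print(f"out-of-bag accuracy: {oob_accuracy:.3f}")
print(f"training accuracy:   {train_accuracy:.3f}")
```

The training accuracy of a random forest is typically close to 1, while the out-of-bag value is usually lower and closer to what a held-out test set would give.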

answered Sep 20 '22 01:09 by ogrisel