
What is the meaning of the nu parameter in Scikit-Learn's SVM class?

I am following the example shown in http://scikit-learn.org/stable/auto_examples/svm/plot_oneclass.html#example-svm-plot-oneclass-py, where a one class SVM is used for anomaly detection. Now, this may be a notation unique to scikit-learn, but I couldn't find an explanation of how to use the parameter nu given to the OneClassSVM constructor.

In http://scikit-learn.org/stable/modules/svm.html#nusvc, it is stated that the parameter nu is a reparameterization of the parameter C (the regularization parameter I am familiar with), but it doesn't state how to perform that reparameterization.

Both a formula and an intuition would be much appreciated.

Thanks!

Guy Adini asked Jun 27 '12


People also ask

What is Nu in SVM?

According to this link, nu specifies the nu parameter for the one-class SVM model. The nu parameter is both a lower bound for the number of samples that are support vectors and an upper bound for the number of samples that are on the wrong side of the hyperplane. The default is 0.1.
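These two bounds can be checked empirically. The following is a minimal sketch (the dataset and hyperparameter values here are invented for illustration, not taken from the question):

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)
X = rng.randn(200, 2)  # 200 two-dimensional "normal" training points

clf = OneClassSVM(nu=0.05, kernel="rbf", gamma=0.1)
clf.fit(X)

# Fraction of training points flagged as outliers (prediction == -1)
outlier_frac = np.mean(clf.predict(X) == -1)
# Fraction of training points that became support vectors
sv_frac = clf.support_.size / len(X)

# nu = 0.05 upper-bounds the outlier fraction and lower-bounds the
# support-vector fraction (up to small numerical tolerances).
print(outlier_frac, sv_frac)
```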

What is Nu SVC?

The nu-support vector classifier (Nu-SVC) is similar to the SVC, differing only in that it uses a nu parameter to control the number of support vectors. In this tutorial, we'll briefly learn how to classify data by using Scikit-learn's NuSVC class in Python.

What is Nu SVR?

Based on the support vector machines method, Nu Support Vector Regression (NuSVR) is an algorithm for solving regression problems. The NuSVR algorithm uses the nu parameter in place of the epsilon parameter of the SVR method.

What is C parameter in SVM?

The C parameter tells the SVM optimization how much you want to avoid misclassifying each training example. For large values of C, the optimization will choose a smaller-margin hyperplane if that hyperplane does a better job of getting all the training points classified correctly.
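This trade-off can be illustrated with a small sketch (the toy data and C values below are made-up assumptions for demonstration): with overlapping classes, a larger C typically yields the same or fewer training misclassifications, at the price of a narrower margin.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(1)
n = 100
# Two overlapping Gaussian blobs, one per class
X = np.vstack([rng.randn(n, 2) - 0.7, rng.randn(n, 2) + 0.7])
y = np.array([0] * n + [1] * n)

errs = {}
for C in (0.01, 100.0):
    clf = SVC(C=C, kernel="linear").fit(X, y)
    errs[C] = np.mean(clf.predict(X) != y)
    # Larger C penalizes training errors more heavily
    print(C, errs[C])
```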


1 Answer

The problem with C and the introduction of nu

The problem with the parameter C is:

  1. that it can take any positive value
  2. that it has no direct interpretation.

It is therefore hard to choose correctly, and one has to resort to cross-validation or direct experimentation to find a suitable value.

In response, Schölkopf et al. reformulated the SVM to take a new regularization parameter nu. This parameter is:

  1. bounded between 0 and 1
  2. directly interpretable

Interpretation of nu

The parameter nu is an upper bound on the fraction of margin errors and a lower bound on the fraction of support vectors relative to the total number of training examples. For example, if you set it to 0.05 you are guaranteed to find at most 5% of your training examples being margin errors (points that are misclassified or fall inside the margin, at the cost of a smaller margin, though) and at least 5% of your training examples being support vectors.
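This guarantee can be sketched empirically; the toy data and hyperparameters below are illustrative assumptions, not part of the original answer:

```python
import numpy as np
from sklearn.svm import NuSVC

rng = np.random.RandomState(42)
n = 200
# Two overlapping classes, so some margin errors are unavoidable
X = np.vstack([rng.randn(n // 2, 2) - 1.0, rng.randn(n // 2, 2) + 1.0])
y = np.array([0] * (n // 2) + [1] * (n // 2))

clf = NuSVC(nu=0.2, kernel="rbf", gamma="scale").fit(X, y)

# nu = 0.2 lower-bounds the fraction of support vectors
sv_frac = clf.support_.size / n
print(sv_frac)
```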

Relationship between C and nu

The relation between C and nu is governed by the following formula:

nu = A + B/C

A and B are constants which are unfortunately not that easy to calculate.

Conclusion

The takeaway message is that the C and nu SVMs are equivalent in their classification power. The regularization in terms of nu is easier to interpret than C, but the nu SVM is usually harder to optimize, and its runtime doesn't scale as well with the number of input samples as the C variant's.

More details (including formulas for A and B) can be found here: Chang CC, Lin CJ - "Training nu-support vector classifiers: theory and algorithms"

Bernhard Kausler answered Oct 11 '22