How to get all alpha values of scikit-learn SVM classifier?

I need the alpha values, which are the Lagrange multipliers of the SVM dual problem, after training an SVM classifier with scikit-learn. According to the documentation, it seems that scikit-learn provides only svm.dual_coef_, which is the product of the Lagrange multiplier alpha and the label of a data point.

I tried to calculate the alpha values manually by dividing the elements of svm.dual_coef_ by the data labels, but since svm.dual_coef_ stores only the coefficients of the support vectors, I'm not sure whether, if I iterate over this array, the order of the support vectors matches their order in the original training data.

So is there a reliable way to get the alpha values of support vectors?

asked Nov 22 '15 by pjhades

People also ask

What is Alpha in SVM?

The Lagrange multiplier, usually denoted by α, is a vector of the weights of all the training points that act as support vectors.

What is C in SVC Sklearn?

The C parameter tells the SVM optimization how much you want to avoid misclassifying each training example. For large values of C, the optimization will choose a smaller-margin hyperplane if that hyperplane does a better job of getting all the training points classified correctly.
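As a rough illustration of this trade-off (the data and C values here are made up for demonstration), a small C tends to produce a wider margin with more points inside it, hence more support vectors, while a large C fits the training data more aggressively:

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Deliberately overlapping clusters so the C trade-off is visible.
X, y = make_blobs(n_samples=100, centers=2, cluster_std=3.0, random_state=0)

loose = SVC(kernel="linear", C=0.01).fit(X, y)    # wide margin, tolerates errors
strict = SVC(kernel="linear", C=100.0).fit(X, y)  # narrow margin, fits harder

# The looser model typically keeps more points inside the margin,
# so it ends up with more support vectors.
print(loose.n_support_.sum(), strict.n_support_.sum())
```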

How do I get support vectors in SVM?

According to the SVM algorithm we find the points closest to the line from both the classes. These points are called support vectors. Now, we compute the distance between the line and the support vectors. This distance is called the margin.
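In scikit-learn you don't need to find these points yourself: a fitted SVC exposes them directly. A minimal sketch with illustrative toy data:

```python
import numpy as np
from sklearn.svm import SVC

# Toy data: two small, well-separated clusters (illustrative values).
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [1.2, 0.9]])
y = np.array([-1, -1, 1, 1])

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# Row indices of the support vectors in the original training data,
# and the support vectors themselves.
print(clf.support_)          # indices into X
print(clf.support_vectors_)  # shape (n_SV, n_features)
```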

What values can be used for kernel parameter of SVC class?

The gamma parameter is the kernel coefficient for the 'rbf', 'poly' and 'sigmoid' kernels. With the default gamma = 'scale', the value used by SVC is 1 / (n_features * X.var()). With gamma = 'auto', it uses 1 / n_features.
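To make the two defaults concrete, here is a sketch (with arbitrary random data) of the values that 'scale' and 'auto' resolve to:

```python
import numpy as np
from sklearn.svm import SVC

X = np.random.RandomState(0).randn(50, 3)
y = (X[:, 0] > 0).astype(int)

clf = SVC(kernel="rbf", gamma="scale").fit(X, y)

# gamma='scale' resolves to 1 / (n_features * X.var())
gamma_scale = 1.0 / (X.shape[1] * X.var())
# gamma='auto' resolves to 1 / n_features
gamma_auto = 1.0 / X.shape[1]
print(gamma_scale, gamma_auto)
```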


1 Answer

Since the alpha values are non-negative by definition (and strictly positive for support vectors), you can recover them by taking the absolute value of dual_coef_:

alphas = np.abs(svm.dual_coef_)

This is a direct consequence of the fact that

svm.dual_coef_[i] = labels[i] * alphas[i]

where labels[i] is either -1 or +1 and alphas[i] is always positive. Furthermore, you can also get each support vector's label through

labels = np.sign(svm.dual_coef_)

using the same observation. This is also why scikit-learn does not store the alphas as such: together with the labels, they are uniquely determined by dual_coef_.

It is easy to understand it once you analyze all possible cases:

  • labels[i] == -1 and alphas[i] > 0 => dual_coef_[i] < 0 and dual_coef_[i] == -alphas[i] == labels[i] * alphas[i]
  • labels[i] == -1 and alphas[i] < 0 => impossible (alphas are non-negative)
  • labels[i] == -1 and alphas[i] == 0 => not a support vector
  • labels[i] == +1 and alphas[i] > 0 => dual_coef_[i] > 0 and dual_coef_[i] == alphas[i] == labels[i] * alphas[i]
  • labels[i] == +1 and alphas[i] < 0 => impossible (alphas are non-negative)
  • labels[i] == +1 and alphas[i] == 0 => not a support vector

Consequently, if dual_coef_[i] is positive it equals alphas[i] and the point belongs to the positive class; if it is negative, alphas[i] == -dual_coef_[i] and the point belongs to the negative class. As for the ordering question: clf.support_ gives the row index of each support vector in the original training data, and the columns of dual_coef_ follow that same order.
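Putting it all together, a self-contained sketch (with synthetic data made up for this demonstration) that recovers the alphas and labels and checks them against the original training order via clf.support_:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
X = np.vstack([rng.randn(20, 2) - 2, rng.randn(20, 2) + 2])
y = np.array([-1] * 20 + [1] * 20)

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# dual_coef_[i] == labels[i] * alphas[i], so:
alphas = np.abs(clf.dual_coef_).ravel()
sv_labels = np.sign(clf.dual_coef_).ravel()

# clf.support_ holds each support vector's row index in the original
# training data, so the recovered labels line up with y:
assert np.array_equal(sv_labels, y[clf.support_])

# For a soft-margin SVM, every alpha of a support vector lies in (0, C]:
assert np.all(alphas > 0) and np.all(alphas <= clf.C + 1e-12)
print(alphas)
```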

answered Oct 03 '22 by lejlot