I'm trying to do the following simple classification using the LinearSVC object in scikit-learn.
I've tried using both version 0.10 and 0.14. Using the code:
from sklearn.svm import LinearSVC, SVC
import numpy as np
data = np.array([[1007., 1076.],
                 [1017., 1009.],
                 [2021., 2029.],
                 [2060., 2085.]])
groups = np.array([1, 1, 2, 2])
svc = LinearSVC()
svc.fit(data, groups)
svc.predict(data)
I get the output:
array([2, 2, 2, 2])
However, if I replace the classifier with
svc = SVC(kernel='linear')
then I get the result
array([ 1., 1., 2., 2.])
which is correct. Does anyone know why using LinearSVC
would botch this simple problem?
The main difference between them is that LinearSVC supports only a linear kernel, whereas SVC lets you choose from several kernels, including non-linear ones. However, SVC is not recommended for large datasets, because its training time grows at least quadratically with the number of samples.
Between SVC and LinearSVC, one important decision criterion is that LinearSVC tends to converge faster as the number of samples grows. This is because the linear kernel is a special case that LIBLINEAR is optimized for, whereas LIBSVM is not.
A linear support vector classifier (LinearSVC) attempts to find the hyperplane that maximizes the margin, that is, the distance between the hyperplane and the nearest samples of each class.
SVC, or Support Vector Classifier, is a supervised machine learning algorithm used for classification. With a non-linear kernel, SVC implicitly maps data points to a higher-dimensional space and then finds the optimal hyperplane that separates the classes there.
The algorithm underlying LinearSVC is very sensitive to extreme values in its input, and on this data it stops at the iteration limit without converging:
>>> svc = LinearSVC(verbose=1)
>>> svc.fit(data, groups)
[LibLinear]....................................................................................................
optimization finished, #iter = 1000
WARNING: reaching max number of iterations
Using -s 2 may be faster (also see FAQ)
Objective value = -0.001256
nSV = 4
LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
intercept_scaling=1, loss='l2', multi_class='ovr', penalty='l2',
random_state=None, tol=0.0001, verbose=1)
(The warning refers to the LibLinear FAQ, since scikit-learn's LinearSVC
is based on that library.)
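If you are not running with verbose=1, the same problem can be detected programmatically. The following is a minimal sketch, assuming a scikit-learn version recent enough to provide sklearn.exceptions.ConvergenceWarning (newer than the versions mentioned in the question). Whether the warning actually fires on this data depends on the library version and solver defaults, so the sketch only reports what it observes; the name hit_iteration_limit is just an illustrative variable.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.svm import LinearSVC

data = np.array([[1007., 1076.],
                 [1017., 1009.],
                 [2021., 2029.],
                 [2060., 2085.]])
groups = np.array([1, 1, 2, 2])

# Record every warning raised while fitting on the unscaled data.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure no warning is filtered out
    LinearSVC().fit(data, groups)

# True if the solver reported that it stopped before converging.
hit_iteration_limit = any(
    issubclass(w.category, ConvergenceWarning) for w in caught
)
print("hit iteration limit:", hit_iteration_limit)
```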
You should standardize the features (zero mean, unit variance) before fitting:
>>> from sklearn.preprocessing import scale
>>> data = scale(data)
>>> svc.fit(data, groups)
[LibLinear]...
optimization finished, #iter = 39
Objective value = -0.240988
nSV = 4
LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
intercept_scaling=1, loss='l2', multi_class='ovr', penalty='l2',
random_state=None, tol=0.0001, verbose=1)
>>> svc.predict(data)
array([1, 1, 2, 2])
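Calling scale() by hand works for this toy example, but new samples must be scaled with the same mean and standard deviation as the training data. One way to handle that, sketched here under the assumption that your scikit-learn version provides make_pipeline and StandardScaler (newer than the versions in the question), is to bundle the scaling step with the classifier:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

data = np.array([[1007., 1076.],
                 [1017., 1009.],
                 [2021., 2029.],
                 [2060., 2085.]])
groups = np.array([1, 1, 2, 2])

# StandardScaler learns each feature's mean and standard deviation during
# fit() and reapplies them in predict(), so scaling stays consistent.
model = make_pipeline(StandardScaler(), LinearSVC())
model.fit(data, groups)

predictions = model.predict(data)
print(predictions)
```

The pipeline is passed the raw, unscaled data both when fitting and when predicting, so there is no separate scaled copy of the data to keep in sync.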