AttributeError:'LinearSVC' object has no attribute 'predict_proba'

I am trying to use the LinearSVC classifier:

Update: Added imports

import nltk
from nltk.tokenize import word_tokenize
from nltk.classify.scikitlearn import SklearnClassifier
from sklearn.svm import LinearSVC, SVC

LinearSVC_classifier = SklearnClassifier(LinearSVC())
LinearSVC_classifier.train(featuresets)

But when I try to classify with probabilities

LinearSVC_classifier.prob_classify(feats)

AttributeError occurs:

AttributeError:'LinearSVC' object has no attribute 'predict_proba'

I checked the sklearn documentation, and it says that this function exists.

How can I fix that?

Aleksandr Baranov asked Nov 15 '17 16:11


4 Answers

According to the sklearn documentation, the method 'predict_proba' is not defined for 'LinearSVC'.

Workaround:

LinearSVC_classifier = SklearnClassifier(SVC(kernel='linear',probability=True))

Use SVC with a linear kernel and the probability argument set to True, as explained here.
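For reference, here is a minimal scikit-learn-only sketch of this workaround, without the NLTK wrapper. The toy data from make_classification and the variable names are assumptions for illustration, not part of the question.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Toy two-class data (an assumption for illustration; use your own features).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# probability=True makes SVC fit an extra calibration step during training,
# which is what enables predict_proba afterwards.
clf = SVC(kernel="linear", probability=True, random_state=0)
clf.fit(X, y)

# One row per sample, one column per class.
proba = clf.predict_proba(X[:5])
print(proba.shape)
```

The extra calibration step makes training noticeably slower than a plain SVC fit, so this is probably best suited to small or medium-sized datasets.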

mdilip answered Nov 12 '22 21:11


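You can use _predict_proba_lr() instead of predict_proba. Note that it is a private method (leading underscore), so it may change between scikit-learn versions. Something like this:

from sklearn import svm

clf = svm.LinearSVC()
clf.fit(X_train, Y_train)

# _predict_proba_lr takes only the feature matrix, not the labels
res = clf._predict_proba_lr(X_test)

res is a 2D array with one row per sample and one column per class, holding the probability estimate for each class.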
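These values should be treated with some caution. As far as I understand it, the private helper simply passes the decision_function scores through a logistic function and, for more than two classes, rescales each row to sum to 1. Nothing is fitted to the data, so the output is a monotone transform of the margins rather than calibrated probabilities. A plain-Python sketch of that idea (scores_to_proba and the scores are made up for illustration, not scikit-learn code):

```python
import math


def logistic(score):
    """Map a signed decision score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))


def scores_to_proba(scores):
    """Turn per-class decision scores for one sample into values that sum to 1.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    squashed = [logistic(s) for s in scores]
    total = sum(squashed)
    return [p / total for p in squashed]


# Made-up one-vs-rest decision scores for one sample and three classes.
row = scores_to_proba([2.0, -1.0, 0.5])
print(row)
```

The class with the largest margin still gets the largest value, so the ranking of classes is preserved even though the numbers themselves are not calibrated.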

Sina answered Nov 12 '22 22:11


Your question does not mention an outside wrapper like NLTK (except in the tag), so it's hard to tell what you really need!

Vivek Kumar's comment applies. LinearSVC has no support for probabilities, while SVC does.

Now some additional remarks:

  • SVM theory is not really about probabilities; support for them comes from add-on approaches that use cross-validation and an additional classifier
    • see Platt scaling
  • the core solver of LinearSVC, liblinear, has no built-in support for this
  • the approach of mdilip above is a valid workaround, but:
    • SVC is based on libsvm and is therefore slower (and may not be suitable for large-scale problems)
  • alternative: build your own pipeline consisting of:
    • LinearSVC
    • sklearn's probability calibration (CalibratedClassifierCV)
It seems someone observed this problem before.

sascha answered Nov 12 '22 21:11


This can happen if there is a mismatch between the scikit-learn version used to train the model and the version used to make predictions with it.

ThusharaJ answered Nov 12 '22 21:11