
HashingVectorizer and MultinomialNB are not working together

I am trying to write a Twitter sentiment analysis program with scikit-learn in Python 2.7. The OS is Ubuntu 14.04.

In the vectorizing step, I want to use HashingVectorizer(). When I test classifier accuracy, it works fine with the LinearSVC, NuSVC, GaussianNB, BernoulliNB and LogisticRegression classifiers, but MultinomialNB raises this error:

Traceback (most recent call last):
  File "/media/test.py", line 310, in <module>
    classifier_rbf.fit(train_vectors, y_trainTweets)
  File "/home/.local/lib/python2.7/site-packages/sklearn/naive_bayes.py", line 552, in fit
    self._count(X, Y)
  File "/home/.local/lib/python2.7/site-packages/sklearn/naive_bayes.py", line 655, in _count
    raise ValueError("Input X must be non-negative")
ValueError: Input X must be non-negative
[Finished in 16.4s with exit code 1] 

Here is the code block related to this error:

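from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.naive_bayes import MultinomialNB

# Default HashingVectorizer settings; this is the setup that triggers the error
vectorizer = HashingVectorizer()
train_vectors = vectorizer.fit_transform(x_trainTweets)
test_vectors = vectorizer.transform(x_testTweets)

classifier_rbf = MultinomialNB()
classifier_rbf.fit(train_vectors, y_trainTweets)
prediction_rbf = classifier_rbf.predict(test_vectors)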

Why is this happening, and how can I solve it?

asked Apr 06 '16 16:04 by ehsan badakhshan



1 Answer

If the non_negative argument isn't available in your version of scikit-learn (as in mine), try:

vectorizer = HashingVectorizer(alternate_sign=False)
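To illustrate why this helps: by default HashingVectorizer applies an alternating sign to the hashed features, so the resulting matrix can contain negative values, which MultinomialNB rejects. Below is a minimal sketch of the fix, assuming a scikit-learn version that supports alternate_sign. The toy tweets and labels are hypothetical stand-ins for x_trainTweets and y_trainTweets, not data from the question.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data standing in for the question's tweets and labels
x_train = ["good great happy", "love this good", "bad awful sad", "hate this bad"]
y_train = ["pos", "pos", "neg", "neg"]
x_test = ["good happy", "awful sad"]

# alternate_sign=False disables the signed hashing, so no feature can be negative
vectorizer = HashingVectorizer(alternate_sign=False)

# HashingVectorizer is stateless, so transform() alone is sufficient
train_vectors = vectorizer.transform(x_train)
test_vectors = vectorizer.transform(x_test)

# Sanity check: the smallest value in the sparse matrix should be >= 0
all_non_negative = train_vectors.min() >= 0

classifier = MultinomialNB()
classifier.fit(train_vectors, y_train)
predictions = classifier.predict(test_vectors)
```

Checking train_vectors.min() before fitting is a quick way to confirm the input satisfies MultinomialNB's non-negativity requirement.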

answered Sep 20 '22 17:09 by BILEL MOSTEFA_CHEBRA
