Dealing with negative values in sklearn MultinomialNB

I am normalizing my text input before running MultinomialNB in sklearn like this:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import Normalizer

vectorizer = TfidfVectorizer(max_df=0.5, stop_words='english', use_idf=True)
lsa = TruncatedSVD(n_components=100)
mnb = MultinomialNB(alpha=0.01)

train_text = vectorizer.fit_transform(raw_text_train)
train_text = lsa.fit_transform(train_text)  # SVD components can be negative
train_text = Normalizer(copy=False).fit_transform(train_text)

mnb.fit(train_text, train_labels)

Unfortunately, MultinomialNB does not accept the negative values created during the LSA stage. Any ideas for getting around this?

asked Jun 11 '14 by seanlorenz

People also ask

How does machine learning deal with negative values?

A common technique for handling negative values is to add a constant value to the data prior to applying the log transform. The transformation is therefore log(Y+a) where a is the constant. Some people like to choose a so that min(Y+a) is a very small positive number (like 0.001). Others choose a so that min(Y+a) = 1.
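For example, a minimal sketch of that shifted transform (the data array here is hypothetical):

import numpy as np

Y = np.array([-3.2, -0.5, 0.0, 1.7, 4.1])  # hypothetical data containing negatives

a = 1.0 - Y.min()      # choose the constant a so that min(Y + a) == 1
Y_log = np.log(Y + a)  # log(Y + a) is now defined for every element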

What is multinomial naive Bayes classifier?

The Multinomial Naive Bayes algorithm is a Bayesian learning approach popular in Natural Language Processing (NLP). The program guesses the tag of a text, such as an email or a newspaper story, using the Bayes theorem. It calculates each tag's likelihood for a given sample and outputs the tag with the greatest chance.
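As a rough illustration of that count-based workflow (the toy corpus and tags below are made up):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

docs = ["cheap pills buy now", "meeting agenda attached", "buy cheap now"]
tags = ["spam", "ham", "spam"]  # hypothetical tags

vec = CountVectorizer()
counts = vec.fit_transform(docs)         # non-negative term counts
clf = MultinomialNB().fit(counts, tags)  # estimates each tag's likelihood

print(clf.predict(vec.transform(["buy pills now"])))  # most likely ['spam']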


1 Answer

I recommend that you don't use Naive Bayes with SVD or other matrix factorizations, because Naive Bayes is based on applying Bayes' theorem with strong (naive) independence assumptions between the features. Use another classifier instead, for example RandomForest.

I tried this experiment, with these results:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import Normalizer

vectorizer = TfidfVectorizer(max_df=0.5, stop_words='english', use_idf=True)
lsa = NMF(n_components=100)
mnb = MultinomialNB(alpha=0.01)

train_text = vectorizer.fit_transform(raw_text_train)
train_text = lsa.fit_transform(train_text)  # NMF components stay non-negative
train_text = Normalizer(copy=False).fit_transform(train_text)

mnb.fit(train_text, train_labels)

This is the same pipeline, but using NMF (non-negative matrix factorization) instead of SVD, and it got 0.04% accuracy.

Changing the classifier from MultinomialNB to RandomForest, I got 79% accuracy.
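For reference, the swap itself is only a small change on top of the original TruncatedSVD pipeline; a minimal sketch, assuming the same train_text and train_labels as in the question (n_estimators=100 is an arbitrary choice, not from the answer):

from sklearn.ensemble import RandomForestClassifier

# RandomForest has no non-negativity requirement, so the negative
# values produced by TruncatedSVD are not a problem here.
rf = RandomForestClassifier(n_estimators=100)
rf.fit(train_text, train_labels)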

Therefore, either change the classifier or don't apply a matrix factorization.

answered Sep 20 '22 by Martin Forte