
Save Naive Bayes Trained Classifier in NLTK

People also ask

How do you save a naive Bayes model?

You can use Python's pickle module to save most machine learning models, and you can restore the saved models later with the same module.

Does NLTK use naive Bayes?

NLTK (Natural Language Toolkit) provides a Naive Bayes classifier for classifying text data.


To save:

import pickle
# The with block closes the file even if pickling raises an error.
with open('my_classifier.pickle', 'wb') as f:
    pickle.dump(classifier, f)

To load later:

import pickle
# Open in binary mode; pickle files are not text.
with open('my_classifier.pickle', 'rb') as f:
    classifier = pickle.load(f)
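
The save and load steps above can also be wrapped into a pair of small helpers. This is only a minimal sketch: `save_classifier` and `load_classifier` are hypothetical names (they are not part of NLTK), and a plain dictionary stands in for the trained classifier so the example runs without NLTK installed.

```python
import os
import pickle
import tempfile


def save_classifier(classifier, path):
    """Serialize any picklable object (e.g. a trained classifier) to path."""
    with open(path, 'wb') as f:
        pickle.dump(classifier, f)


def load_classifier(path):
    """Load an object previously written by save_classifier."""
    with open(path, 'rb') as f:
        return pickle.load(f)


# Stand-in for a trained classifier (an assumption, for illustration only).
dummy_classifier = {'labels': ['pos', 'neg'], 'prior': {'pos': 0.5, 'neg': 0.5}}

# Write to a temporary directory so the example does not touch real files.
path = os.path.join(tempfile.mkdtemp(), 'my_classifier.pickle')
save_classifier(dummy_classifier, path)
restored = load_classifier(path)
```

With a real NLTK classifier you would pass the trained object instead of the dictionary. Only unpickle files you created yourself, since `pickle.load` can execute arbitrary code from the file.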

I ran into the same problem and could not save the object, because it contains an NLTK ELEProbDist instance. NLTK is also very slow: training took 45 minutes on a decent-sized set, so I decided to implement my own version of the algorithm (run it with PyPy, or rename it to .pyx and compile it with Cython). It takes about 3 minutes on the same set and can simply save its data as JSON (I plan to add pickle support, which is faster).

I started a simple GitHub project; check out the code here
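
To illustrate the idea of a hand-written classifier that saves its data as JSON, here is a rough sketch. It is not the code from the project mentioned above: the class name `TinyNaiveBayes`, its methods, and the choice of multinomial Naive Bayes with add-one smoothing are all assumptions made for this example, and labels and words are assumed to be strings so they can serve as JSON keys.

```python
import json
import math
from collections import defaultdict


class TinyNaiveBayes:
    """Multinomial Naive Bayes over word lists with add-one smoothing."""

    def __init__(self):
        self.label_counts = defaultdict(int)  # label -> documents seen
        self.word_counts = defaultdict(lambda: defaultdict(int))  # label -> word -> count
        self.vocab = set()

    def train(self, labeled_docs):
        # Counts accumulate, so train() can be called again with more data.
        for words, label in labeled_docs:
            self.label_counts[label] += 1
            for w in words:
                self.word_counts[label][w] += 1
                self.vocab.add(w)

    def classify(self, words):
        total_docs = sum(self.label_counts.values())
        vocab_size = max(len(self.vocab), 1)  # avoid dividing by zero
        best_label, best_score = None, float('-inf')
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts.get(label, {})
            total_words = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for w in words:
                score += math.log((counts.get(w, 0) + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def to_json(self):
        # Only plain dicts and ints are stored, so json can handle them.
        return json.dumps({
            'label_counts': dict(self.label_counts),
            'word_counts': {l: dict(c) for l, c in self.word_counts.items()},
        })

    @classmethod
    def from_json(cls, text):
        data = json.loads(text)
        model = cls()
        for label, n in data['label_counts'].items():
            model.label_counts[label] = n
        for label, counts in data['word_counts'].items():
            for w, n in counts.items():
                model.word_counts[label][w] = n
                model.vocab.add(w)
        return model


model = TinyNaiveBayes()
model.train([
    (['great', 'fun', 'movie'], 'pos'),
    (['boring', 'awful', 'movie'], 'neg'),
])
restored = TinyNaiveBayes.from_json(model.to_json())
```

Because the model is nothing more than counts, the JSON file is human-readable, and a restored model can keep training on new examples by calling `train` again.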


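To retrain the pickled classifier:

import pickle
import nltk

with open('originalnaivebayes5k.pickle', 'rb') as f:
    classifier = pickle.load(f)
# train() is a class method: it returns a new classifier built only from
# training_set and does not update the loaded one in place, so keep the result.
classifier = classifier.train(training_set)
print('Accuracy:', nltk.classify.accuracy(classifier, testing_set) * 100)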