
How do I know what priors I'm giving to scikit-learn? (Naive Bayes classifiers.)

In scikit-learn's naive Bayes classifiers you can specify the prior probabilities, and the classifier will use those provided probabilities in its calculations. But I don't know how the prior probabilities should be ordered.

import random
from sklearn.naive_bayes import BernoulliNB
data = [[0], [1]]
classes = ['light bulb', 'door mat']
random.shuffle(classes)  # This simulates getting classes from a complex source.
classifier = BernoulliNB(class_prior=[0, 1])  # Here we provide prior probabilities.
classifier.fit(data, classes)

In the above code, how do I know which class is assumed to be the 100% prior? Do I need to consider the order of the classes in the data before specifying prior probabilities?

I would also be interested in knowing where this is documented.

Buttons840 asked Jan 07 '14 20:01



1 Answer

It seems to be undocumented. During fit, the target is preprocessed by a LabelBinarizer, so you can find the class order for your data with:

from sklearn.preprocessing import LabelBinarizer
labelbin = LabelBinarizer()
labelbin.fit_transform(classes)

Then labelbin.classes_ contains the distinct classes found in your target data (classes), in sorted order. The entries of class_prior are matched to the classes in that same order.
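
To make the ordering concrete, here is a minimal sketch that builds class_prior from a label-to-prior mapping and then checks it against the fitted classifier. The desired_priors dict and the 0.8/0.2 values are hypothetical, chosen only for illustration, and the sketch assumes that the sorted order returned by np.unique matches the order the classifier stores in classes_.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

data = [[0], [1]]
classes = ['light bulb', 'door mat']

# Hypothetical mapping: the prior we want for each label, keyed by name.
desired_priors = {'light bulb': 0.8, 'door mat': 0.2}

# np.unique returns the distinct labels in sorted order. This is assumed
# to be the same order the fitted classifier exposes as classes_.
label_order = np.unique(classes).tolist()
class_prior = [desired_priors[label] for label in label_order]

classifier = BernoulliNB(class_prior=class_prior)
classifier.fit(data, classes)

# After fitting, classes_ lists the labels and class_log_prior_ holds the
# log of the prior for each label, position by position.
fitted_order = classifier.classes_.tolist()
fitted_priors = np.exp(classifier.class_log_prior_)
for label, prior in zip(fitted_order, fitted_priors):
    print(label, round(float(prior), 3))
```

Keying the priors by label and deriving the list from the label order avoids depending on the order in which the labels happen to appear in the training data. Comparing classifier.classes_ with the priors after fitting is a cheap sanity check.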

alko answered Sep 22 '22 07:09
