In scikit-learn's naive Bayes classifiers you can specify the prior probabilities, and the classifier will use those provided probabilities in its calculations. But I don't know how the prior probabilities should be ordered.
import random
from sklearn.naive_bayes import BernoulliNB

data = [[0], [1]]
classes = ['light bulb', 'door mat']
random.shuffle(classes)  # This simulates getting classes from a complex source.
classifier = BernoulliNB(class_prior=[0, 1])  # Here we provide prior probabilities.
classifier.fit(data, classes)
In the above code, how do I know which class is assigned the prior of 1 (100%)? Do I need to consider the order of the classes in the data before specifying the prior probabilities?
I would also be interested in knowing where this is documented.
It seems to be undocumented. When fit is called, the target is preprocessed by LabelBinarizer, so you can see how your data's classes will be ordered with

from sklearn.preprocessing import LabelBinarizer

labelbin = LabelBinarizer()
labelbin.fit_transform(classes)

labelbin.classes_ then contains the resulting classes for your target data (classes), in the order corresponding to the priors: the i-th entry of class_prior applies to the i-th class in labelbin.classes_.
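For a quick sanity check, here is a minimal sketch along the same lines (the prior values 0.25/0.75 are arbitrary, chosen instead of the question's [0, 1] only to avoid taking the log of zero): after fitting, classifier.classes_ exposes the same sorted ordering the priors are matched against.

from sklearn.naive_bayes import BernoulliNB

data = [[0], [1]]
classes = ['light bulb', 'door mat']

# class_prior entries line up with the sorted class labels exposed via classes_
classifier = BernoulliNB(class_prior=[0.25, 0.75])
classifier.fit(data, classes)

print(classifier.classes_)  # ['door mat' 'light bulb'] -> priors 0.25 and 0.75 respectively

So with the question's class_prior=[0, 1], the prior of 1 would go to 'light bulb', because 'light bulb' sorts after 'door mat'.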