How to specify the prior probability for scikit-learn's Naive Bayes


I'm using the scikit-learn library (Python) for a machine learning project. One of the algorithms I'm using is the Gaussian Naive Bayes implementation. One of the attributes of the GaussianNB class is the following:

class_prior_ : array, shape (n_classes,)

I want to set the class prior manually, since the data I use is very skewed and the recall of one of the classes is very important. Assigning a high prior probability to that class should increase its recall.

However, I can't figure out how to set the attribute correctly. I've already read the topics below, but their answers don't work for me.

How can the prior probabilities manually set for the Naive Bayes clf in scikit-learn?

How do I know what prior's I'm giving to sci-kit learn? (Naive-bayes classifiers.)

This is my code:

    gnb = GaussianNB()
    gnb.class_prior_ = [0.1, 0.9]
    gnb.fit(data.XTrain, yTrain)
    yPredicted = gnb.predict(data.XTest)

I figured this was the correct syntax, and that I could find out which class belongs to which position in the array by playing with the values, but the results remain unchanged. No errors were raised either.

What is the correct way of setting the attributes of the GaussianNB algorithm from the scikit-learn library?

Link to the scikit documentation of GaussianNB


asked Jun 17 '15 15:06

pevadi

1 Answer

@Jianxun Li: there is in fact a way to set prior probabilities in GaussianNB. It's called 'priors' and it's available as a constructor parameter. See the documentation: "Parameters: priors : array-like, shape (n_classes,) Prior probabilities of the classes. If specified the priors are not adjusted according to the data." So let me give you an example:

    from sklearn.naive_bayes import GaussianNB

    # minimal dataset
    X = [[1, 0], [1, 0], [0, 1]]
    y = [0, 0, 1]

    # use empirical prior, learned from y
    mn = GaussianNB()
    # predict() expects a 2D array: one row per sample
    print(mn.fit(X, y).predict([[1, 1]]))
    print(mn.class_prior_)

    >>>[0]
    >>>[0.66666667 0.33333333]

But if you change the prior probabilities, it gives a different answer, which I believe is what you are looking for.

    # use custom prior to make 1 more likely
    mn = GaussianNB(priors=[0.1, 0.9])
    mn.fit(X, y).predict([[1, 1]])

    >>>array([1])
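To tie this back to the code in the question, here is a minimal sketch on the same toy dataset. It assumes a reasonably recent scikit-learn, and the variable names are only illustrative. It shows two things: a class_prior_ value assigned before fit() is replaced by the empirical prior during fitting, and the entries of priors follow the order of the fitted classes_ attribute (the sorted class labels), so there should be no need to guess which position belongs to which class.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Same toy dataset as above
X = np.array([[1, 0], [1, 0], [0, 1]])
y = np.array([0, 0, 1])

# Approach from the question: assign class_prior_ before fitting.
# No error is raised, but fit() recomputes the attribute from y.
gnb = GaussianNB()
gnb.class_prior_ = [0.1, 0.9]
gnb.fit(X, y)
overwritten_prior = gnb.class_prior_  # empirical prior, about [0.667, 0.333]

# Approach from the answer: pass the priors to the constructor.
custom = GaussianNB(priors=[0.1, 0.9]).fit(X, y)
class_order = custom.classes_      # priors[i] applies to classes_[i]
custom_prior = custom.class_prior_  # stays at [0.1, 0.9]
prediction = custom.predict([[1, 1]])

print(overwritten_prior)
print(class_order, custom_prior, prediction)
```

Note that the values passed as priors must be non-negative, sum to 1, and have one entry per class; otherwise fit() raises a ValueError.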


answered Oct 13 '22 13:10

Ram Seshadri
