
How to specify the prior probability for scikit-learn's Naive Bayes

I'm using the scikit-learn machine learning library (Python) for a machine learning project. One of the algorithms I'm using is the Gaussian Naive Bayes implementation. One of the attributes of the GaussianNB class is the following:

class_prior_ : array, shape (n_classes,) 

I want to alter the class prior manually since the data I use is very skewed and the recall of one of the classes is very important. By assigning a high prior probability to that class the recall should increase.

However, I can't figure out how to set the attribute correctly. I've read the below topics already but their answers don't work for me.

How can the prior probabilities manually set for the Naive Bayes clf in scikit-learn?

How do I know what prior's I'm giving to sci-kit learn? (Naive-bayes classifiers.)

This is my code:

    gnb = GaussianNB()
    gnb.class_prior_ = [0.1, 0.9]
    gnb.fit(data.XTrain, yTrain)
    yPredicted = gnb.predict(data.XTest)

I figured this was the correct syntax, and that I could find out which class belongs to which position in the array by playing with the values, but the results remain unchanged. No errors were raised either.

What is the correct way of setting the attributes of the GaussianNB algorithm from scikit-learn library?

Link to the scikit documentation of GaussianNB

asked Jun 17 '15 15:06 by pevadi


People also ask

What is class prior in Naive Bayes Sklearn?

class_prior_ is a fitted attribute rather than a parameter. Once you fit GaussianNB(), you can access the class_prior_ attribute. When no priors are supplied, it is calculated from the relative frequency of each label in your training sample.
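The counting described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not scikit-learn's own code; the helper name `empirical_class_prior` and the toy labels are assumptions made for the example.

```python
from collections import Counter


def empirical_class_prior(labels):
    """Return the relative frequency of each class, keyed by label.

    Hypothetical helper: it mimics the idea of an empirical prior,
    it is not part of scikit-learn.
    """
    counts = Counter(labels)
    total = len(labels)
    return {label: counts[label] / total for label in sorted(counts)}


# Toy labels: two samples of class 0, one sample of class 1.
y = [0, 0, 1]
prior = empirical_class_prior(y)
print(prior)  # class 0 gets about 0.667, class 1 about 0.333
```

With skewed data the minority class ends up with a small prior, which is why the asker wants to override it.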

How do you find the probability of Naive Bayes?

The conditional probability can be calculated using the joint probability, although it would be intractable. Bayes Theorem provides a principled way for calculating the conditional probability. The simple form of the calculation for Bayes Theorem is as follows: P(A|B) = P(B|A) * P(A) / P(B)
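A small worked example may make the formula more concrete. The numbers below are invented for illustration (they do not come from the question), and the function name `posterior` is an assumption. The sketch applies P(A|B) = P(B|A) * P(A) / P(B) for a two-outcome case, expanding P(B) with the law of total probability.

```python
def posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """Compute P(A|B) from P(A), P(B|A) and P(B|not A)."""
    # P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)
    evidence = (likelihood_b_given_a * prior_a
                + likelihood_b_given_not_a * (1 - prior_a))
    return likelihood_b_given_a * prior_a / evidence


# Same likelihoods, two different priors for A (made-up values).
rare = posterior(prior_a=0.1, likelihood_b_given_a=0.8, likelihood_b_given_not_a=0.2)
balanced = posterior(prior_a=0.5, likelihood_b_given_a=0.8, likelihood_b_given_not_a=0.2)
print(rare, balanced)  # the larger prior yields the larger posterior
```

Raising the prior of A while keeping the likelihoods fixed raises the posterior of A, which is the effect the asker is hoping for when raising the prior of the important class.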

What is prior and posterior probability in Naive Bayes?

The posterior probability is calculated by updating the prior probability using Bayes' theorem. In statistical terms, the posterior probability is the probability of event A occurring given that event B has occurred.

How do you use Naive Bayes Sklearn?

Step 1: Calculate the prior probability for the given class labels.
Step 2: Find the likelihood of each attribute for each class.
Step 3: Put these values into Bayes' formula and calculate the posterior probability.
Step 4: See which class has the higher posterior probability; the input is assigned to that class.
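The four steps can be sketched by hand for a single numeric feature. This is a simplified illustration in plain Python, not scikit-learn's implementation (which, among other things, works in log space and smooths the variances). The helper names `fit` and `predict`, the optional `priors` argument and the toy data are all assumptions made for the example.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit(xs, ys):
    """Steps 1 and 2: per-class prior, mean and variance of one feature."""
    model = {}
    for label in sorted(set(ys)):
        values = [x for x, y in zip(xs, ys) if y == label]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        model[label] = {"prior": len(values) / len(ys), "mean": mean, "var": var}
    return model


def predict(model, x, priors=None):
    """Steps 3 and 4: posterior per class, then pick the most probable class."""
    scores = {}
    for label, params in model.items():
        prior = params["prior"] if priors is None else priors[label]
        scores[label] = prior * gaussian_pdf(x, params["mean"], params["var"])
    total = sum(scores.values())
    posteriors = {label: score / total for label, score in scores.items()}
    return max(posteriors, key=posteriors.get), posteriors


# Skewed toy data: six samples of class 0, two samples of class 1.
xs = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 2.0, 2.4]
ys = [0, 0, 0, 0, 0, 0, 1, 1]
model = fit(xs, ys)

label_default, _ = predict(model, 1.45)                           # empirical prior
label_custom, _ = predict(model, 1.45, priors={0: 0.1, 1: 0.9})   # manual prior
print(label_default, label_custom)
```

On this toy data the borderline point 1.45 goes to class 0 under the empirical prior and to class 1 once the prior is shifted towards class 1, because only the prior term in step 3 changes.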


1 Answer

@Jianxun Li: there is in fact a way to set prior probabilities in GaussianNB. It's called 'priors' and it's available as a constructor parameter. See the documentation: "Parameters: priors : array-like, shape (n_classes,) Prior probabilities of the classes. If specified the priors are not adjusted according to the data." So let me give you an example:

    from sklearn.naive_bayes import GaussianNB

    # minimal dataset
    X = [[1, 0], [1, 0], [0, 1]]
    y = [0, 0, 1]

    # use empirical prior, learned from y
    mn = GaussianNB()
    print(mn.fit(X, y).predict([[1, 1]]))  # predict expects a 2D array of samples
    print(mn.class_prior_)

    # Output:
    # [0]
    # [0.66666667 0.33333333]

But if you change the prior probabilities, it gives a different answer, which I believe is what you are looking for.

    # use custom prior to make class 1 more likely
    mn = GaussianNB(priors=[0.1, 0.9])
    print(mn.fit(X, y).predict([[1, 1]]))  # predict expects a 2D array of samples

    # Output:
    # [1]
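To tie this back to the code in the question, here is a short sketch contrasting the two approaches on the same minimal dataset. It assumes a scikit-learn version in which GaussianNB accepts the `priors` parameter; the variable names `overwritten` and `custom` are chosen for this example. As far as I can tell, assigning `class_prior_` before calling `fit` has no lasting effect because `fit` recomputes that attribute, which would explain why the asker saw unchanged results and no error.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# minimal dataset
X = [[1, 0], [1, 0], [0, 1]]
y = [0, 0, 1]

# Approach from the question: assign the fitted attribute, then fit.
overwritten = GaussianNB()
overwritten.class_prior_ = [0.1, 0.9]
overwritten.fit(X, y)
print(overwritten.class_prior_)  # empirical frequencies, not [0.1, 0.9]

# Approach from this answer: pass the priors to the constructor.
custom = GaussianNB(priors=[0.1, 0.9])
custom.fit(X, y)
print(custom.classes_)       # priors[i] applies to classes_[i]
print(custom.class_prior_)   # the supplied priors are kept
print(custom.predict(np.array([[1, 1]])))
```

Printing `classes_` also answers the side question of which array position belongs to which class: the priors follow the order of `classes_`, which is the sorted set of labels seen during `fit`.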
answered Oct 13 '22 13:10 by Ram Seshadri