What is the difference between a Bayesian network and a Naive Bayes classifier? I noticed one is just implemented in Matlab as classify, while the other has an entire net toolbox.
If you could also explain in your answer which one is more likely to give better accuracy, I would be grateful (not a prerequisite).
Well, you need to know that the distinction between Bayes' theorem and Naive Bayes is that Naive Bayes assumes conditional independence, whereas Bayes' theorem does not. This means Naive Bayes treats all input features as independent of one another given the class.
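To make the distinction concrete, here is a sketch in notation of my own choosing (y for the class, x_1, ..., x_n for the input features; these symbols are not from the original answer). Bayes' theorem on its own only rewrites the posterior; the "naive" part is the extra factorization of the likelihood:

```latex
% Bayes' theorem: always true, no independence assumption
P(y \mid x_1, \dots, x_n) = \frac{P(y)\, P(x_1, \dots, x_n \mid y)}{P(x_1, \dots, x_n)}

% Naive Bayes: additionally assumes the features are independent given the class
P(x_1, \dots, x_n \mid y) = \prod_{i=1}^{n} P(x_i \mid y)
```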
A Bayesian network captures the joint probabilities of the events represented by the model; that is, a Bayesian belief network describes the joint probability distribution over a set of variables.
In the statistics literature, naive Bayes models are known under a variety of names, including simple Bayes and independence Bayes. All these names reference the use of Bayes' theorem in the classifier's decision rule, but naive Bayes is not (necessarily) a Bayesian method.
A Bayesian network (also known as a Bayes network, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Bayes' rule is used for inference in Bayesian networks.
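As a sketch of what "conditional dependencies via a DAG" means in practice: the joint distribution factorizes into one term per node, each conditioned only on that node's parents in the graph. The notation Pa(X_i) for the parent set of X_i is my own shorthand, not something from the quoted text.

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
```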
Short answer, if you're only interested in solving a prediction task: use Naive Bayes.
A Bayesian network (it has a good Wikipedia page) models relationships between features in a very general way. If you know what these relationships are, or have enough data to derive them, then it may be appropriate to use a Bayesian network.
A Naive Bayes classifier is a simple model that describes a particular class of Bayesian network, one where all of the features are class-conditionally independent. Because of this, there are certain problems that Naive Bayes cannot solve (example below). However, its simplicity also makes it easier to apply, and it requires less data to get a good result in many cases.
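One rough way to see why Naive Bayes tends to need less data is to count how many probabilities each model has to estimate. The sketch below assumes binary features and a binary class, and compares Naive Bayes against the worst case of a network with no independence assumptions at all (a full joint table). The function names are made up for illustration.

```python
def naive_bayes_params(n_features):
    # Assumption: binary class and binary features.
    # One number for P(y=1), plus P(x_i=1 | y) for each feature and each class value.
    return 1 + 2 * n_features

def full_joint_params(n_features):
    # Worst case with no independence assumptions: one probability per joint
    # configuration of (y, x_1, ..., x_n), minus one because they sum to 1.
    return 2 ** (n_features + 1) - 1

for n in (2, 10, 20):
    print(n, naive_bayes_params(n), full_joint_params(n))
```

A real Bayesian network usually sits somewhere between these two extremes, depending on how many edges it has.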
You have a learning problem with binary features x1 and x2 and a target variable y = x1 XOR x2.
In a Naive Bayes classifier, x1 and x2 must be treated independently, so you would compute things like "the probability that y = 1 given that x1 = 1". Hopefully you can see that this isn't helpful, because x1 = 1 doesn't make y = 1 any more or less likely. Since a Bayesian network does not assume independence, it would be able to solve such a problem.
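The XOR example can be checked by hand with a few lines of plain Python. This is only a sketch: it assumes the four input combinations are equally likely and estimates every probability by counting rows of the truth table. The helper names are hypothetical.

```python
from itertools import product

# Assumption: the four rows of the XOR truth table, each equally likely.
rows = [(x1, x2, x1 ^ x2) for x1, x2 in product((0, 1), repeat=2)]

def prob(pred, given=lambda r: True):
    """Empirical P(pred | given), estimated by counting rows."""
    pool = [r for r in rows if given(r)]
    return sum(1 for r in pool if pred(r)) / len(pool)

def naive_bayes_posterior(x1, x2):
    """P(y=1 | x1, x2) under the naive Bayes factorization."""
    score = {}
    for y in (0, 1):
        prior = prob(lambda r: r[2] == y)
        like1 = prob(lambda r: r[0] == x1, lambda r: r[2] == y)
        like2 = prob(lambda r: r[1] == x2, lambda r: r[2] == y)
        score[y] = prior * like1 * like2
    return score[1] / (score[0] + score[1])

def joint_posterior(x1, x2):
    """P(y=1 | x1, x2) from the full joint, with no independence assumption."""
    return prob(lambda r: r[2] == 1,
                lambda r: r[0] == x1 and r[1] == x2)

for x1, x2 in product((0, 1), repeat=2):
    print(x1, x2, naive_bayes_posterior(x1, x2), joint_posterior(x1, x2))
```

Under these assumptions the naive Bayes posterior comes out as 0.5 for every input, since each feature alone says nothing about y, while conditioning on both features jointly recovers y exactly.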
Naive Bayes is just a restricted/constrained form of a general Bayesian network in which you enforce the constraint that the class node has no parents and that the nodes corresponding to the attribute variables have no edges between them. As such, there is nothing that prevents a general Bayesian network from being used for classification: the predicted class is the one with the maximum probability conditioned on all the other variables being set to the prediction instance values, in the usual Bayesian inference fashion.
A good paper to read on this is Friedman, Geiger, and Goldszmidt, "Bayesian Network Classifiers", Machine Learning, 29, 131–163 (1997). Of particular interest is section 3. Though Naive Bayes is a constrained form of a more general Bayesian network, this paper also talks about why Naive Bayes can and does outperform a general Bayesian network in classification tasks.
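The two structural constraints in this answer are easy to state as a check on a graph. The sketch below represents a network as a dict mapping each node to the set of its parents; that representation and the function name are my own choices for illustration, not taken from the paper or from any toolbox.

```python
def is_naive_bayes_structure(parents, class_node):
    """Check the two constraints on a DAG given as {node: set of parent nodes}.

    Hypothetical helper: the representation is an assumption for illustration.
    """
    # Constraint 1: the class node has no parents.
    if parents[class_node]:
        return False
    # Constraint 2: no edges between attribute nodes, so the only parent
    # an attribute may have is the class node itself.
    return all(
        parents[node] <= {class_node}
        for node in parents
        if node != class_node
    )

naive = {"y": set(), "x1": {"y"}, "x2": {"y"}}
augmented = {"y": set(), "x1": {"y"}, "x2": {"y", "x1"}}  # extra edge x1 -> x2
xor_net = {"x1": set(), "x2": set(), "y": {"x1", "x2"}}   # class has parents

print(is_naive_bayes_structure(naive, "y"))
print(is_naive_bayes_structure(augmented, "y"))
print(is_naive_bayes_structure(xor_net, "y"))
```

Only the first structure passes; the other two are still valid Bayesian networks and can still be used for classification, they are just not Naive Bayes.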