This is why I'm asking this question: Last year I made some C++ code to compute posterior probabilities for a particular type of model (described by a Bayesian network). The model worked pretty well and some other people started to use my software. Now I want to improve my model. Since I'm already coding slightly different inference algorithms for the new model, I decided to use python because runtime wasn't critically important and python may let me make more elegant and manageable code.
Usually in this situation I'd search for an existing Bayesian network package in python, but the inference algorithms I'm using are my own, and I also thought this would be a great opportunity to learn more about good design in python.
I've already found a great python module for network graphs (networkx), which allows you to attach a dictionary to each node and to each edge. Essentially, this would let me give nodes and edges properties.
For a particular network and it's observed data, I need to write a function that computes the likelihood of the unassigned variables in the model.
For instance, in the classic "Asia" network (http://www.bayesserver.com/Resources/Images/AsiaNetwork.png), with the states of "XRay Result" and "Dyspnea" known, I need to write a function to compute the likelihood that the other variables have certain values (according to some model).
Here is my programming question: I'm going to try a handful of models, and in the future it's possible I'll want to try another model after that. For instance, one model might look exactly like the Asia network. In another model, a directed edge might be added from "Visit to Asia" to "Has Lung Cancer." Another model might use the original directed graph, but the probability model for the "Dyspnea" node given the "Tuberculosis or Cancer" and "Has Bronchitis" nodes might be different. All of these models will compute the likelihood in a different way.
All the models will have substantial overlap; for instance, multiple edges going into an "Or" node will always make a "0" if all inputs are "0" and a "1" otherwise. But some models will have nodes that take on integer values in some range, while others will be boolean.
In the past I've struggled with how to program things like this. I'm not going to lie; there's been a fair amount of copied and pasted code and sometimes I've needed to propagate changes in a single method to multiple files. This time I really want to spend the time to do this the right way.
Some options:
Thanks a lot for your help.
Update: Object oriented ideas help a lot here (each node has a designated set of predecessor nodes of a certain node subtype, and each node has a likelihood function that computes its likelihood of different outcome states given the states of the predecessor nodes, etc.). OOP FTW!
Bayesian networks are ideal for taking an event that occurred and predicting the likelihood that any one of several possible known causes was the contributing factor. For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms.
A Bayesian network is a compact, flexible and interpretable representation of a joint probability distribution. It is also an useful tool in knowledge discovery as directed acyclic graphs allow representing causal relations between variables. Typically, a Bayesian network is learned from data.
Bayesian network models capture both conditionally dependent and conditionally independent relationships between random variables. Models can be prepared by experts or learned from data, then used for inference to estimate the probabilities for causal or subsequent events.
I've been working on this kind of thing in my spare time for quite a while. I think I'm on my third or fourth version of this same problem right now. I'm actually getting ready to release another version of Fathom (https://github.com/davidrichards/fathom/wiki) with dynamic bayesian models included and a different persistence layer.
As I've tried to make my answer clear, it's gotten quite long. I apologize for that. Here's how I've been attacking the problem, which seems to answer some of your questions (somewhat indirectly):
I've started with Judea Pearl's breakdown of belief propagation in a Bayesian Network. That is, it's a graph with prior odds (causal support) coming from parents and likelihoods (diagnostic support) coming from children. In this way, the basic class is just a BeliefNode, much like what you described with an extra node between BeliefNodes, a LinkMatrix. In this way, I explicitly choose the type of likelihood I'm using by the type of LinkMatrix I use. It makes it easier to explain what the belief network is doing afterwards as well as keeps the computation simpler.
Any subclassing or changes that I'd make to the basic BeliefNode would be for binning continuous variables, rather than changing propagation rules or node associations.
I've decided on keeping all data inside the BeliefNode, and only fixed data in the LinkedMatrix. This has to do with ensuring that I maintain clean belief updates with minimal network activity. This means that my BeliefNode stores:
The LinkMatrix can be constructed with a number of different algorithms, depending on the nature of the relationship between the nodes. All of the models that you're describing would just be different classes that you'd employ. Probably the easiest thing to do is default to an or-gate, and then choose other ways to handle the LinkMatrix if we have a special relationship between the nodes.
I use MongoDB for persistence and caching. I access this data inside of an evented model for speed and asynchronous access. This makes the network fairly performant while also having the opportunity to be very large if it needs to be. Also, since I'm using Mongo in this way, I can easily create a new context for the same knowledge base. So, for example, if I have a diagnostic tree, some of the diagnostic support for a diagnosis will come from a patient's symptoms and tests. What I do is create a context for that patient and then propagate my beliefs based on the evidence from that particular patient. Likewise, if a doctor said that a patient was likely experiencing two or more diseases, then I could change some of my link matrices to propagate the belief updates differently.
If you don't want to use something like Mongo for your system, but you are planning on having more than one consumer working on the knowledge base, you will need to adopt some sort of caching system to make sure that you are working on freshly-updated nodes at all times.
My work is open source, so you can follow along if you'd like. It's all Ruby, so it would be similar to your Python, but not necessarily a drop-in replacement. One thing that I like about my design is that all of the information needed for humans to interpret the results can be found in the nodes themselves, rather than in the code. This can be done in the qualitative descriptions, or in the structure of the network.
So, here are some important differences I have with your design:
One big caveat: some of what I'm talking about hasn't been released yet. I worked on the stuff I am talking about until about 2:00 this morning, so it's definitely current and definitely getting regular attention from me, but isn't all available to the public just yet. Since this is a passion of mine, I'd be happy to answer any questions or work together on a project if you'd like.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With