 

Clojure DAG (Bayesian Network)

I would like to build a Bayesian Network (BN) in Clojure, since I haven't found any existing project that does this.

I have studied a lot of BN theory, but I still can't see how to implement the network (I am not what people would call a "guru" at anything, least of all functional programming).

I do know that a BN is nothing more than a DAG plus a probability table for each node, but I have no clue how to implement the DAG.

My first idea was a big set (the DAG) containing small maps (the nodes of the DAG); every map would have a name (probably a keyword), a probability table (another map?), a vector of parents, and a vector of non-descendants.

Now I don't know how to implement the references to the parents and non-descendants (what should I put in the two vectors?). I guess a pointer would be perfect, but Clojure lacks them; I could put the names (keywords) of the other nodes in the vectors, but wouldn't that be slow?

I was also thinking that instead of vectors I could use sets, which would make it faster to find the descendants of a node.

There is a similar problem for the probability table, where I still need references to the other nodes.
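
To make this concrete, here is a rough sketch of the kind of representation described above (the rain/sprinkler/wet-grass example and all names are only illustrative, and sets of parents/children are used for simplicity; following a "reference" is just a hash-map lookup by keyword, so it should not be slow):

(def network
  {:rain      {:parents  #{}
               :children #{:sprinkler :grass-wet}
               :table    {[] 0.2}}                    ; P(rain=true)
   :sprinkler {:parents  #{:rain}
               :children #{:grass-wet}
               :table    {[true] 0.01, [false] 0.4}}  ; [rain] -> P(sprinkler=true)
   :grass-wet {:parents  #{:sprinkler :rain}
               :children #{}
               :table    {[true true] 0.99, [true false] 0.9,
                          [false true] 0.8, [false false] 0.0}}}) ; [sprinkler rain] -> P(grass-wet=true)

;; Following a "reference" is an ordinary map lookup:
(get-in network [:sprinkler :parents])   ;=> #{:rain}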

Finally, I would also like to learn the BN (build the network starting from the data), which means I will be changing the probability tables, edges, and nodes a lot.

Should I use mutable types, or would they only increase the complexity?

asked Jul 14 '12 by Siscia


2 Answers

This is not a complete answer, but here is a possible encoding for the example network from the Wikipedia article (rain, sprinkler, wet grass). Each node has a name, a list of successors (children), and a probability table:

;; A node has a name, a list of children, and a probability table (here a function)
(defn node [name children table-fn]
  {:name name :children children :table table-fn})

Also, here is a little helper function for building true/false probabilities:

;; builds a true/false probability function: given true it returns true-prob,
;; given false it returns 1 - true-prob
(defn tf [true-prob] #(if % true-prob (- 1.0 true-prob)))

The function above returns a closure which, when given a true (resp. false) value, returns the probability of the event X=true (resp. X=false), where X is the random variable we are encoding.
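
For example, at the REPL (these calls are only illustrative):

(def p-rain (tf 0.2))   ; closure encoding P(rain=true) = 0.2
(p-rain true)           ;=> 0.2
(p-rain false)          ;=> 0.8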

Since the network is a DAG, we can reference nodes directly from each other (exactly like the pointers you mentioned) without having to care about circular references. We just build the graph in reverse topological order, children before parents:

(let [gw (node "grass wet" []
               (fn [& {:keys [sprinkler rain]}]
                 (tf (cond (and sprinkler rain) 0.99
                           sprinkler 0.9
                           rain 0.8
                           :else 0.0))))

      sk (node "sprinkler" [gw]
               (fn [& {:keys [rain]}] (tf (if rain 0.01 0.4))))

      rn (node "rain" [sk gw]
               (constantly (tf 0.2)))]

  (def dag {:nodes {:grass-wet gw :sprinkler sk :rain rn}
            :joint (fn [g s r]
                     (* (((:table gw) :sprinkler s :rain r) g)
                        (((:table sk) :rain r) s)
                        (((:table rn)) r)))}))

The probability table of each node is a function of the states of its parent nodes and returns a closure giving the probability of the node's true and false values. For example,

((:table (:grass-wet dag)) :sprinkler true :rain false)

... returns the closure built by (tf 0.9); calling it with true gives 0.9, and with false gives 0.09999999999999998.

The resulting :joint function combines probabilities according to this formula:

P(G,S,R) = P(G|S,R) · P(S|R) · P(R)

And ((:joint dag) true true true) returns 0.0019800000000000004. Indeed, each value returned by ((:table <x>) <args>) is a closure around an if, which returns the probability given the state of the corresponding random variable. We call each closure with the respective true/false value to extract the appropriate probability, and multiply them together.

Here, I am cheating a little, because I suppose that the joint function should really be computed by traversing the graph (a macro could help in the general case). This also feels a little messy, notably regarding the nodes' states, which are not necessarily only true and false: you would most likely use a map in the general case.
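
For what it's worth, one possible sketch of such a more generic version (not part of the code above, names are made up): each node stores the keywords of its parents and a :table function taking a map of parent states, and the joint is a product over all nodes (reusing tf from above):

(def nodes
  {:rain      {:parents []
               :table (fn [_] (tf 0.2))}
   :sprinkler {:parents [:rain]
               :table (fn [{:keys [rain]}] (tf (if rain 0.01 0.4)))}
   :grass-wet {:parents [:sprinkler :rain]
               :table (fn [{:keys [sprinkler rain]}]
                        (tf (cond (and sprinkler rain) 0.99
                                  sprinkler 0.9
                                  rain 0.8
                                  :else 0.0)))}})

(defn joint
  "Probability of a full assignment, e.g. {:rain true :sprinkler true :grass-wet true}."
  [nodes assignment]
  (reduce * (for [[k {:keys [parents table]}] nodes]
              ((table (select-keys assignment parents)) (get assignment k)))))

;; (joint nodes {:rain true :sprinkler true :grass-wet true})
;; ;=> 0.00198 (up to floating-point rounding)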

answered Oct 07 '22 by coredump


In general, the joint distribution of a BN factorizes as

P(X1, ..., Xn) = prod_i P(Xi | parents(Xi))

To achieve this, you need a list of nodes, where each node contains:

  • node name
  • list of parents
  • probability table
  • list of children

The probability table is probably easiest to handle when it is flat, with each row corresponding to one configuration of the parents and each column corresponding to one value of the node. This assumes you are using a record (or map) to hold all of the values. The current value of the node can also be stored in the node itself.

Nodes with no parents have only one row.

Each row should be normalized, after which P(node | parents) = table[row, col].
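
As a rough illustration of this flat layout in Clojure (all names and helpers here are hypothetical, assuming boolean nodes and columns ordered [false true]):

(defn node [name parents table]
  {:name name :parents parents :table table})   ; one row per parent configuration

(defn normalize-row [row]
  (let [s (reduce + row)]
    (mapv #(/ % s) row)))

(defn normalize [n]
  (update n :table #(mapv normalize-row %)))

;; Row index of a parent configuration, treating the booleans as bits.
(defn row-index [parent-values]
  (reduce (fn [acc v] (+ (* 2 acc) (if v 1 0))) 0 parent-values))

;; P(node = value | parents = parent-values)
(defn p [n parent-values value]
  (get-in (:table n) [(row-index parent-values) (if value 1 0)]))

(def sprinkler
  (normalize (node "sprinkler" [:rain] [[0.6  0.4]      ; rain = false
                                        [0.99 0.01]]))) ; rain = true

;; (p sprinkler [true] true) ;=> 0.01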

You don't really need the list of children, but having it can make topological sorting easier (and any DAG can be topologically sorted).
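
For example, a rough depth-first sketch of a topological sort over the children lists (assuming a hypothetical shape where nodes live in a map keyed by name and each node holds a :children set; no cycle checking is done, since the input is a DAG):

(defn topo-sort [nodes]
  (let [visit (fn visit [order k]
                (if (some #{k} order)
                  order
                  ;; visit all children first, then append this node
                  (conj (reduce visit order (get-in nodes [k :children])) k)))]
    (vec (rseq (reduce visit [] (keys nodes))))))

;; (topo-sort {:rain      {:children #{:sprinkler :grass-wet}}
;;             :sprinkler {:children #{:grass-wet}}
;;             :grass-wet {:children #{}}})
;; ;=> [:rain :sprinkler :grass-wet]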

The biggest problem is that the number of cells in the probability table is the product of the cardinalities of the parents and of the node itself. I handled this in C++ with a sparse table using row mapping.
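
In Clojure, one cheap way to get a similar effect might be a map from parent configuration to P(node = true), with missing rows defaulting to 0.0 (again just a sketch for boolean nodes; names are made up):

(def sparse-cpt
  ;; parent configuration [sprinkler rain] -> P(grass-wet = true)
  {[true  true]  0.99
   [true  false] 0.9
   [false true]  0.8})   ; [false false] omitted, i.e. P(true) = 0.0

(defn sparse-p [cpt parent-values value]
  (let [pt (get cpt parent-values 0.0)]   ; P(node = true | parents)
    (if value pt (- 1.0 pt))))

;; (sparse-p sparse-cpt [false false] true)  ;=> 0.0
;; (sparse-p sparse-cpt [false false] false) ;=> 1.0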

Querying the DAG is a different matter, and the best method depends on the size of the network and on whether an approximate answer is sufficient. There isn't enough room to cover the options here; searching for Kevin Murphy's Bayes Net Toolbox might be helpful.

I realize you are specifically looking for an existing implementation, but with a little work you can roll your own.

answered Oct 07 '22 by DAV