
Create Bayesian Network and learn parameters with Python3.x [closed]

I'm searching for the most appropriate tool for Python 3.x on Windows to create a Bayesian network, learn its parameters from data and perform inference.

The network structure, which I want to define myself, is as follows: [figure: network structure from the paper]

It is taken from this paper.

All the variables are discrete (each can take only 2 possible states) except "Size" and "GraspPose", which are continuous and should be modeled as a Mixture of Gaussians.

The authors use the Expectation-Maximization algorithm to learn the parameters of the conditional probability tables and the Junction-Tree algorithm to compute exact inference.

As I understand it, all of this is implemented in MATLAB with Murphy's Bayes Net Toolbox.

I tried to find something similar in Python, and here are my results:

  1. Python Bayesian Network Toolbox, http://sourceforge.net/projects/pbnt.berlios/ (http://pbnt.berlios.de/). The website doesn't work and the project doesn't seem to be maintained.
  2. BayesPy, https://github.com/bayespy/bayespy. I think this is what I actually need, but I can't find examples similar to my case that show how to approach constructing the network structure (see the sketch after this list for roughly what I have in mind).
  3. PyMC seems to be a powerful module, but I have problems importing it on Windows 64 with Python 3.3. When I install the development version I get this warning:

    WARNING (theano.configdefaults): g++ not detected ! Theano will be unable to execute optimized C-implementations (for both CPU and GPU) and will default to Python implementations. Performance will be severely degraded. To remove this warning, set Theano flags cxx to an empty string.
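
To show what I mean, this is roughly the kind of construction I would hope to write in BayesPy for one continuous node with a discrete parent. It is only a sketch adapted from BayesPy's Gaussian-mixture example; the node names, priors and data below are placeholders of mine, not the network from the paper:

import numpy as np
from bayespy.nodes import Dirichlet, Categorical, GaussianARD, Gamma, Mixture
from bayespy.inference import VB

N = 500                        # number of (synthetic) observations
data = np.random.randn(N)      # placeholder 1-D "Size" measurements

# Discrete parent with 2 states (e.g. something like "task")
p = Dirichlet([1e-3, 1e-3])
z = Categorical(p, plates=(N,))

# One Gaussian component per parent state -> mixture of Gaussians for "Size"
mu = GaussianARD(0, 1e-6, plates=(2,))
tau = Gamma(1e-6, 1e-6, plates=(2,))
size = Mixture(z, GaussianARD, mu, tau)

# Observe the data and run variational Bayesian (EM-style) updates
size.observe(data)
Q = VB(size, z, mu, tau, p)
Q.update(repeat=100)

Something along these lines for the full network, plus exact inference over the discrete part, is what I'm after.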

UPDATE:

  1. libpgm (http://pythonhosted.org/libpgm/). Exactly what I need, but unfortunately it does not support Python 3.x.
  2. A very interesting, actively developed library: PGMPY, https://github.com/pgmpy/pgmpy/. Unfortunately, continuous variables and learning from data are not supported yet (see the sketch after this list for how far the discrete part already gets me).
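
For completeness, declaring the discrete skeleton in pgmpy is already straightforward; what is missing for my case are the Gaussian-mixture nodes and parameter learning from data. A minimal sketch, with edge names I abbreviated myself from the figure:

from pgmpy.models import BayesianModel  # the class was named BayesianModel at the time

# Discrete part of the structure only; 'size' and 'grasp_pose' would actually be continuous
model = BayesianModel([('task', 'size'),
                       ('task', 'fill_level'),
                       ('task', 'object_shape'),
                       ('size', 'grasp_pose'),
                       ('fill_level', 'grasp_pose'),
                       ('object_shape', 'grasp_pose')])

print(sorted(model.nodes()))
print(sorted(model.edges()))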

Any advice and concrete examples would be highly appreciated.

asked Feb 10 '15 by Spu


2 Answers

It looks like pomegranate was recently updated to include Bayesian Networks. I haven't tried it myself, but the interface looks nice and sklearn-ish.
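
I haven't run it, but based on the project's README, learning a network from discrete data seems to look roughly like this (an untested sketch against the pre-1.0 API; newer releases may differ):

import numpy as np
from pomegranate import BayesianNetwork

# Toy binary data; each column would be one of your discrete variables
X = np.random.randint(2, size=(1000, 3))

# Learn structure and parameters directly from the samples
model = BayesianNetwork.from_samples(X, algorithm='exact')

# Inference: None marks the values to be predicted given the observed ones
print(model.predict([[0, 1, None]]))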

answered Nov 11 '22 by James Atwood


Try the bnlearn library; it contains many functions to learn parameters from data and to perform inference.

pip install bnlearn 

Your use case would look like this:

# Import the library
import bnlearn

# Define the network structure
edges = [('task', 'size'),
         ('lat var', 'size'),
         ('task', 'fill level'),
         ('task', 'object shape'),
         ('task', 'side graspable'),
         ('size', 'GrasPose'),
         ('task', 'GrasPose'),
         ('fill level', 'GrasPose'),
         ('object shape', 'GrasPose'),
         ('side graspable', 'GrasPose'),
         ('GrasPose', 'latvar')]

# Make the actual Bayesian DAG
DAG = bnlearn.make_DAG(edges)

# The DAG is stored as an adjacency matrix
print(DAG['adjmat'])

# target           task   size  lat var  ...  side graspable  GrasPose  latvar
# source                                 ...
# task            False   True    False  ...            True      True   False
# size            False  False    False  ...           False      True   False
# lat var         False   True    False  ...           False     False   False
# fill level      False  False    False  ...           False      True   False
# object shape    False  False    False  ...           False      True   False
# side graspable  False  False    False  ...           False      True   False
# GrasPose        False  False    False  ...           False     False    True
# latvar          False  False    False  ...           False     False   False
#
# [8 rows x 8 columns]

# No CPDs are in the DAG yet. Let's see what happens if we print it.
bnlearn.print_CPD(DAG)
# >[BNLEARN.print_CPD] No CPDs to print. Use bnlearn.plot(DAG) to make a plot.

# Plot the DAG. Note that it can be oriented differently if you re-make the plot.
bnlearn.plot(DAG)

[figure: plot of the pre-defined DAG]

Now we need data to learn the parameters. Suppose these are stored in your df. The variable names in the data file must be present in the DAG.

# Read data
import pandas as pd
df = pd.read_csv('path_to_your_data.csv')

# Learn the parameters and store the CPDs in the DAG.
# Use the methodtype you desire. Options are maximumlikelihood or bayes.
DAG = bnlearn.parameter_learning.fit(DAG, df, methodtype='maximumlikelihood')

# CPDs are present in the DAG at this point.
bnlearn.print_CPD(DAG)

# Start making inferences now. As an example:
q1 = bnlearn.inference.fit(DAG, variables=['lat var'], evidence={'fill level':1, 'size':0, 'task':1})

Below is a working example with a demo dataset (sprinkler). You can play around with this.

# Import example dataset
df = bnlearn.import_example('sprinkler')
print(df)
#      Cloudy  Sprinkler  Rain  Wet_Grass
# 0         0          0     0          0
# 1         1          0     1          1
# 2         0          1     0          1
# 3         1          1     1          1
# 4         1          1     1          1
# ..      ...        ...   ...        ...
# 995       1          0     1          1
# 996       1          0     1          1
# 997       1          0     1          1
# 998       0          0     0          0
# 999       0          1     1          1
#
# [1000 rows x 4 columns]

# Define the network structure
edges = [('Cloudy', 'Sprinkler'),
         ('Cloudy', 'Rain'),
         ('Sprinkler', 'Wet_Grass'),
         ('Rain', 'Wet_Grass')]

# Make the actual Bayesian DAG
DAG = bnlearn.make_DAG(edges)

# Print the CPDs
bnlearn.print_CPD(DAG)
# [BNLEARN.print_CPD] No CPDs to print. Use bnlearn.plot(DAG) to make a plot.

# Plot the DAG
bnlearn.plot(DAG)

[figure: plot of the sprinkler DAG]

# Parameter learning on the user-defined DAG and input data
DAG = bnlearn.parameter_learning.fit(DAG, df)

# Print the learned CPDs
bnlearn.print_CPD(DAG)

# [BNLEARN.print_CPD] Independencies:
# (Cloudy _|_ Wet_Grass | Rain, Sprinkler)
# (Sprinkler _|_ Rain | Cloudy)
# (Rain _|_ Sprinkler | Cloudy)
# (Wet_Grass _|_ Cloudy | Rain, Sprinkler)
# [BNLEARN.print_CPD] Nodes: ['Cloudy', 'Sprinkler', 'Rain', 'Wet_Grass']
# [BNLEARN.print_CPD] Edges: [('Cloudy', 'Sprinkler'), ('Cloudy', 'Rain'), ('Sprinkler', 'Wet_Grass'), ('Rain', 'Wet_Grass')]
# CPD of Cloudy:
# +-----------+-------+
# | Cloudy(0) | 0.494 |
# +-----------+-------+
# | Cloudy(1) | 0.506 |
# +-----------+-------+
# CPD of Sprinkler:
# +--------------+--------------------+--------------------+
# | Cloudy       | Cloudy(0)          | Cloudy(1)          |
# +--------------+--------------------+--------------------+
# | Sprinkler(0) | 0.4807692307692308 | 0.7075098814229249 |
# +--------------+--------------------+--------------------+
# | Sprinkler(1) | 0.5192307692307693 | 0.2924901185770751 |
# +--------------+--------------------+--------------------+
# CPD of Rain:
# +---------+--------------------+---------------------+
# | Cloudy  | Cloudy(0)          | Cloudy(1)           |
# +---------+--------------------+---------------------+
# | Rain(0) | 0.6518218623481782 | 0.33695652173913043 |
# +---------+--------------------+---------------------+
# | Rain(1) | 0.3481781376518219 | 0.6630434782608695  |
# +---------+--------------------+---------------------+
# CPD of Wet_Grass:
# +--------------+--------------------+---------------------+---------------------+---------------------+
# | Rain         | Rain(0)            | Rain(0)             | Rain(1)             | Rain(1)             |
# +--------------+--------------------+---------------------+---------------------+---------------------+
# | Sprinkler    | Sprinkler(0)       | Sprinkler(1)        | Sprinkler(0)        | Sprinkler(1)        |
# +--------------+--------------------+---------------------+---------------------+---------------------+
# | Wet_Grass(0) | 0.7553816046966731 | 0.33755274261603374 | 0.25588235294117645 | 0.37910447761194027 |
# +--------------+--------------------+---------------------+---------------------+---------------------+
# | Wet_Grass(1) | 0.2446183953033268 | 0.6624472573839663  | 0.7441176470588236  | 0.6208955223880597  |
# +--------------+--------------------+---------------------+---------------------+---------------------+

# Make an inference
q1 = bnlearn.inference.fit(DAG, variables=['Wet_Grass'], evidence={'Rain':1, 'Sprinkler':0, 'Cloudy':1})

# +--------------+------------------+
# | Wet_Grass    |   phi(Wet_Grass) |
# +==============+==================+
# | Wet_Grass(0) |           0.2559 |
# +--------------+------------------+
# | Wet_Grass(1) |           0.7441 |
# +--------------+------------------+

print(q1.values)
# array([0.25588235, 0.74411765])

More examples can be found on the documentation pages of bnlearn or in the blog.

answered Nov 11 '22 by erdogant