I am trying to figure out how to properly build a discrete-state Markov chain model with PyMC.
As an example, let's make a chain of length T = 10 where the Markov state is binary, the initial state distribution is [0.2, 0.8], and the probability of switching states is 0.01 in state 1 and 0.5 in state 2:
import numpy as np
import pymc as pm  # PyMC 2.x

T = 10                                 # chain length
prior0 = [0.2, 0.8]                    # initial state distribution
transMat = [[0.99, 0.01], [0.5, 0.5]]  # row i holds the transition probabilities out of state i
To build the model, I create an array of state variables and an array of transition probabilities that depend on the state variables (using the pymc.Index function):
states = np.empty(T, dtype=object)
states[0] = pm.Categorical('state_0', prior0)
transPs = np.empty(T, dtype=object)
transPs[0] = pm.Index('trans_0', transMat, states[0])
for i in range(1, T):
    # each state is drawn from the transition row selected by the previous state
    states[i] = pm.Categorical('state_%i' % i, transPs[i-1])
    # pm.Index picks the row of transMat corresponding to the current state
    transPs[i] = pm.Index('trans_%i' % i, transMat, states[i])
Sampling the model shows that the state marginals are what they should be (compared with a model built with Kevin Murphy's BNT package in Matlab):
model = pm.MCMC([states, transPs])
model.sample(10000, 5000)  # 10000 iterations, first 5000 discarded as burn-in
[np.mean(model.trace('state_%i' % i)[:]) for i in range(T)]
prints out:
[-----------------100%-----------------] 10000 of 10000 complete in 7.5 sec
[0.80020000000000002,
0.39839999999999998,
0.20319999999999999,
0.1118,
0.064199999999999993,
0.044600000000000001,
0.033000000000000002,
0.026200000000000001,
0.024199999999999999,
0.023800000000000002]
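These agree with the exact marginals, which can be computed by propagating the initial distribution through the transition matrix (a quick NumPy check, reusing the variables defined above):
p = np.array(prior0)
P = np.array(transMat)
exact = []
for t in range(T):
    exact.append(p[1])  # P(state_t = 1), i.e. the expected trace mean
    p = p.dot(P)        # advance the distribution one step
print(exact)            # [0.8, 0.402, 0.20698, 0.1114202, ...]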
My question is: this does not seem like the most elegant way to build a Markov chain with PyMC. Is there a cleaner way that does not require the array of deterministic functions?
My goal is to write a PyMC-based package for more general dynamic Bayesian networks.
A Markov chain is a discrete stochastic process with the Markov property: P(X_t | X_{t-1}, ..., X_1) = P(X_t | X_{t-1}). It is fully determined by a transition probability matrix P, which defines the transition probabilities P_ij = P(X_t = j | X_{t-1} = i), and an initial probability distribution specified by the vector x, where x_i = P(X_0 = i).
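As a minimal illustration of this definition, here is a plain-NumPy sketch that forward-simulates such a chain, using the P and x from the question above:
import numpy as np

x = np.array([0.2, 0.8])                  # initial distribution
P = np.array([[0.99, 0.01], [0.5, 0.5]])  # transition matrix
rng = np.random.default_rng(0)

state = rng.choice(2, p=x)                # draw X_0 from x
chain = [state]
for _ in range(9):
    state = rng.choice(2, p=P[state])     # draw X_t from row X_{t-1} of P
    chain.append(state)
print(chain)                              # the sampled state sequence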
In NLP, a Markov chain can be used to generate a sequence of words that form a sentence, and hidden Markov models are used for named-entity recognition and part-of-speech tagging. In machine learning more broadly, Markov decision processes are used to model states, actions, and rewards in reinforcement learning.
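For instance, a toy word-level generator with a made-up bigram table (purely illustrative, not derived from any corpus) might look like:
import random

# hypothetical bigram table: each word maps to its possible successors
bigrams = {
    'the': ['cat', 'dog'],
    'cat': ['sat'],
    'dog': ['ran'],
    'sat': ['<end>'],
    'ran': ['<end>'],
}

word, sentence = 'the', ['the']
while word != '<end>':
    word = random.choice(bigrams[word])
    if word != '<end>':
        sentence.append(word)
print(' '.join(sentence))  # e.g. "the cat sat"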
Hidden Markov models (HMMs) are a class of probabilistic graphical models that let us infer a sequence of unknown (hidden) variables from a set of observed variables. A simple example of an HMM is predicting the weather (hidden variable) based on the type of clothes that someone wears (observed).
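A minimal generative sketch of that weather/clothes example (all probabilities here are invented for illustration):
import numpy as np

weather = ['sunny', 'rainy']      # hidden states
clothes = ['t-shirt', 'coat']     # observations

trans = np.array([[0.8, 0.2],     # P(next weather | sunny)
                  [0.4, 0.6]])    # P(next weather | rainy)
emit = np.array([[0.9, 0.1],      # P(clothes | sunny)
                 [0.2, 0.8]])     # P(clothes | rainy)

rng = np.random.default_rng(1)
z = 0                             # start in 'sunny'
for t in range(5):
    x = rng.choice(2, p=emit[z])  # emit an observation from the hidden state
    print(weather[z], '->', clothes[x])
    z = rng.choice(2, p=trans[z]) # transition to the next hidden state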
As far as I know, you have to encode the distribution of each time step as a deterministic function of the previous time step, because that's what it is: there is no randomness in the parameters, since you defined them in the problem set-up. However, I think your question may have been aimed more at finding an intuitive way to represent the model. One alternative is to encode the state transitions directly as a function of the previous time step.
import numpy as np
from pymc import Bernoulli, MCMC

def generate_timesteps(N, p_init, p_trans):
    timesteps = np.empty(N, dtype=object)
    # a success denotes being in state 2, a failure being in state 1
    timesteps[0] = Bernoulli('T0', p_init)
    for i in range(1, N):
        # probability of being in state 2 at time step `i` given time step `i-1`
        p_i = p_trans[1]*timesteps[i-1] + p_trans[0]*(1 - timesteps[i-1])
        timesteps[i] = Bernoulli('T%d' % i, p_i)
    return timesteps
timesteps = generate_timesteps(10, 0.8, [0.01, 0.5])  # matches prior0 and transMat above
model = MCMC(timesteps)
model.sample(10000)  # no burn-in necessary since we're sampling directly from the distribution
[np.mean(model.trace(t).gettrace()) for t in timesteps]
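The resulting means can be checked against the exact marginals in the same way as before:
means = [np.mean(model.trace(t).gettrace()) for t in timesteps]
p = np.array([0.2, 0.8])
P = np.array([[0.99, 0.01], [0.5, 0.5]])
for t, m in enumerate(means):
    print(t, m, p[1])  # sampled vs. exact P(state 2) at step t
    p = p.dot(P)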