Understanding how to construct a higher-order Markov chain

Suppose I want to predict whether a person belongs to class1 = healthy or class2 = fever. I have a data set whose observations come from the domain {normal, cold, dizzy}.

The transition matrix would contain the transition probabilities estimated from our training data set, while the initial vector would contain the probability that a person starts (day 1) in a state x from the domain {normal, cold, dizzy}; this is also estimated from the training set.

If I want to build a first-order Markov chain, I would generate a 3x3 transition matrix and a 1x3 initial vector per class, like so:

> TransitionMatrix
       normal cold dizzy
normal     NA   NA    NA
cold       NA   NA    NA
dizzy      NA   NA    NA

> InitialVector
     normal cold dizzy
[1,]     NA   NA    NA

The NA entries will be filled with the corresponding probabilities.
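To make this concrete, here is a rough R sketch of how I imagine these estimates being filled in from training sequences (the training data below is made up purely for illustration):

states <- c("normal", "cold", "dizzy")

## made-up training sequences for one class (e.g. healthy)
train <- list(
  c("normal", "normal", "cold", "dizzy"),
  c("normal", "cold", "cold", "normal"),
  c("dizzy", "cold", "normal", "normal")
)

## 1x3 initial vector: relative frequency of each state on day 1
day1 <- sapply(train, function(s) s[1])
InitialVector <- prop.table(table(factor(day1, levels = states)))

## 3x3 transition matrix: count state -> next-state pairs, then row-normalise
## (rows with no observed outgoing transitions would need smoothing)
TransitionMatrix <- matrix(0, 3, 3, dimnames = list(states, states))
for (s in train) {
  for (i in seq_len(length(s) - 1)) {
    TransitionMatrix[s[i], s[i + 1]] <- TransitionMatrix[s[i], s[i + 1]] + 1
  }
}
TransitionMatrix <- TransitionMatrix / rowSums(TransitionMatrix)

The same would be repeated with the fever sequences to get the second class's matrix and vector.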

1 - My question is about transition matrices in higher-order chains. For example, in a second-order MC, would we have a transition matrix of size |domain|² x |domain|², like so:

               normal->normal normal->cold normal->dizzy cold->normal cold->cold cold->dizzy dizzy->normal dizzy->cold dizzy->dizzy
normal->normal             NA           NA            NA           NA         NA          NA            NA          NA           NA
normal->cold               NA           NA            NA           NA         NA          NA            NA          NA           NA
normal->dizzy              NA           NA            NA           NA         NA          NA            NA          NA           NA
cold->normal               NA           NA            NA           NA         NA          NA            NA          NA           NA
cold->cold                 NA           NA            NA           NA         NA          NA            NA          NA           NA
cold->dizzy                NA           NA            NA           NA         NA          NA            NA          NA           NA
dizzy->normal              NA           NA            NA           NA         NA          NA            NA          NA           NA
dizzy->cold                NA           NA            NA           NA         NA          NA            NA          NA           NA
dizzy->dizzy               NA           NA            NA           NA         NA          NA            NA          NA           NA

Here cell (1,1) represents the following sequence: normal->normal->normal->normal.

Or would it instead be just |domain|² x |domain|, like so:

               normal cold dizzy
normal->normal     NA   NA    NA
normal->cold       NA   NA    NA
normal->dizzy      NA   NA    NA
cold->normal       NA   NA    NA
cold->cold         NA   NA    NA
cold->dizzy        NA   NA    NA
dizzy->normal      NA   NA    NA
dizzy->cold        NA   NA    NA
dizzy->dizzy       NA   NA    NA

Here cell (1,1) represents normal->normal->normal, which is different from the previous representation.

2 - What about the initial vector for an MC of order 2? Would we need two initial vectors of size 1 x |domain|, like so:

     normal cold dizzy
[1,]     NA   NA    NA

This would lead to two initial vectors per class: the first giving the probability of each of {normal, cold, dizzy} occurring on the first day for the healthy/fever class, and the second giving the probability of occurrence on the second day. In total this would give 4 initial vectors.

Or would we just need one initial vector of size 1 x |domain|², like so:

    normal->normal normal->cold normal->dizzy cold->normal cold->cold cold->dizzy dizzy->normal dizzy->cold dizzy->dizzy
[1,]             NA           NA            NA           NA         NA          NA            NA          NA           NA

I can see how the second way of representing the initial vector could be problematic if we want to classify an observation with only one state.
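One workaround I can think of (just a sketch, the placeholder values and names are mine) would be to marginalise the 1 x |domain|² vector down to a day-1 distribution when the observation contains a single state:

states <- c("normal", "cold", "dizzy")
pairs  <- as.vector(outer(states, states, paste, sep = "->"))

## hypothetical 1 x 9 initial distribution over ordered pairs (uniform placeholder)
init2 <- setNames(rep(1/9, length(pairs)), pairs)

## marginal probability of the day-1 state, usable for a one-state observation
day1 <- tapply(init2, sub("->.*", "", pairs), sum)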

Asked Aug 15 '16 by Imlerith

1 Answer

Say the set of states is S. Typically, for an nth-order chain:

  1. The transition matrix has dimensions |S|ⁿ x |S|. This is because, given the current history of n states, we need the probability of the single next state. It is true that this single next state induces another compound history state of length n, but the transition itself is to the single next state. See, e.g., this example in Wikipedia.

  2. The initial distribution is a distribution over |S|ⁿ elements (your second option).
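To make both points concrete, here is a rough R sketch for order n = 2; the training sequences and variable names are only illustrative, not taken from the question:

states <- c("normal", "cold", "dizzy")
pairs  <- as.vector(outer(states, states, paste, sep = "->"))

## made-up training sequences for one class
train <- list(
  c("normal", "normal", "cold", "dizzy", "dizzy"),
  c("normal", "cold", "cold", "normal", "cold")
)

## |S|^2 x |S| transition matrix: rows = (state[t-1], state[t]), cols = state[t+1]
trans2 <- matrix(0, nrow = length(pairs), ncol = length(states),
                 dimnames = list(pairs, states))
## initial distribution over the |S|^2 compound states (first two observations)
init2 <- setNames(numeric(length(pairs)), pairs)

for (s in train) {
  first_pair <- paste(s[1], s[2], sep = "->")
  init2[first_pair] <- init2[first_pair] + 1
  for (i in seq_len(length(s) - 2)) {
    from_pair <- paste(s[i], s[i + 1], sep = "->")
    trans2[from_pair, s[i + 2]] <- trans2[from_pair, s[i + 2]] + 1
  }
}

init2 <- init2 / sum(init2)
rs <- rowSums(trans2)
trans2[rs > 0, ] <- trans2[rs > 0, ] / rs[rs > 0]   # row-normalise observed rows only

Classifying a new sequence would then amount to multiplying init2[pair of the first two states] by the relevant trans2 entries (one factor per later observation) under each class's model, and picking the class with the larger likelihood.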

Answered Nov 10 '22 by Ami Tavory