
Explain markov-chain algorithm in layman's terms


I don't quite understand this Markov program. It takes two words as a prefix, saves up a list of the suffixes that follow them, and then makes random words from that?

    /* Copyright (C) 1999 Lucent Technologies */
    /* Excerpted from 'The Practice of Programming' */
    /* by Brian W. Kernighan and Rob Pike */

    #include <time.h>
    #include <cstdlib>   // srand, rand
    #include <iostream>
    #include <string>
    #include <deque>
    #include <map>
    #include <vector>

    using namespace std;

    const int  NPREF = 2;
    const char NONWORD[] = "\n";    // cannot appear as real line: we remove newlines
    const int  MAXGEN = 10000; // maximum words generated

    typedef deque<string> Prefix;

    map<Prefix, vector<string> > statetab; // prefix -> suffixes

    void        build(Prefix&, istream&);
    void        generate(int nwords);
    void        add(Prefix&, const string&);

    // markov main: markov-chain random text generation
    int main(void)
    {
        int nwords = MAXGEN;
        Prefix prefix;  // current input prefix

        srand(time(NULL));
        for (int i = 0; i < NPREF; i++)
            add(prefix, NONWORD);
        build(prefix, cin);
        add(prefix, NONWORD);
        generate(nwords);
        return 0;
    }

    // build: read input words, build state table
    void build(Prefix& prefix, istream& in)
    {
        string buf;

        while (in >> buf)
            add(prefix, buf);
    }

    // add: add word to suffix deque, update prefix
    void add(Prefix& prefix, const string& s)
    {
        if (prefix.size() == NPREF) {
            statetab[prefix].push_back(s);
            prefix.pop_front();
        }
        prefix.push_back(s);
    }

    // generate: produce output, one word per line
    void generate(int nwords)
    {
        Prefix prefix;
        int i;

        for (i = 0; i < NPREF; i++)
            add(prefix, NONWORD);
        for (i = 0; i < nwords; i++) {
            vector<string>& suf = statetab[prefix];
            const string& w = suf[rand() % suf.size()];
            if (w == NONWORD)
                break;
            cout << w << "\n";
            prefix.pop_front(); // advance
            prefix.push_back(w);
        }
    }
Takafu Keyomama asked Nov 02 '10 20:11




1 Answer

According to Wikipedia, a Markov Chain is a random process where the next state is dependent on the previous state. This is a little difficult to understand, so I'll try to explain it better:

What you're looking at seems to be a program that generates a text-based Markov Chain. Essentially, the algorithm for that is as follows:

  1. Split a body of text into tokens (words, punctuation).
  2. Build a frequency table. This is a data structure with one entry (key) for every word in your body of text. Each key is mapped to another data structure that is basically a list of all the words that follow that word (the key), along with how often each one follows it.
  3. Generate the Markov Chain. To do this, select a starting point (a key from your frequency table) and then randomly select the next state to go to (the next word). The choice of next word is weighted by its frequency (so some words are more probable than others). After that, use this new word as the key and start over.
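
The three steps above can be sketched in C++ roughly as follows. This is a minimal illustration and not the program from the question: the names `tokenize`, `buildTable`, `generate` and `FrequencyTable` are made up for this example, the state is a single word, apostrophes are kept inside words as a simplifying assumption, and `<random>` is used in place of `rand()`.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <random>
#include <string>
#include <vector>

// word -> (following word -> number of times it followed)
using FrequencyTable = std::map<std::string, std::map<std::string, int>>;

// Step 1: split text into tokens. Whitespace separates tokens; every
// punctuation mark except the apostrophe becomes a token of its own.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char c : text) {
        bool isSeparator = std::isspace(c) || (std::ispunct(c) && c != '\'');
        if (!isSeparator) {
            current += static_cast<char>(c);
            continue;
        }
        if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
        if (!std::isspace(c))
            tokens.push_back(std::string(1, static_cast<char>(c)));
    }
    if (!current.empty())
        tokens.push_back(current);
    return tokens;
}

// Step 2: count, for every token, which tokens follow it and how often.
FrequencyTable buildTable(const std::vector<std::string>& tokens) {
    FrequencyTable table;
    for (std::size_t i = 0; i + 1 < tokens.size(); ++i)
        ++table[tokens[i]][tokens[i + 1]];
    return table;
}

// Step 3: walk the table from `start`, choosing each next token with
// probability proportional to its count. Stops after maxTokens tokens
// or when the current token has no recorded follower.
std::vector<std::string> generate(const FrequencyTable& table,
                                  const std::string& start,
                                  std::size_t maxTokens,
                                  std::mt19937& rng) {
    std::vector<std::string> out;
    std::string current = start;
    while (out.size() < maxTokens) {
        out.push_back(current);
        auto it = table.find(current);
        if (it == table.end())
            break;
        std::vector<std::string> followers;
        std::vector<int> weights;
        for (const auto& entry : it->second) {
            followers.push_back(entry.first);
            weights.push_back(entry.second);
        }
        assert(!followers.empty());
        std::discrete_distribution<std::size_t> pick(weights.begin(), weights.end());
        current = followers[pick(rng)];
    }
    return out;
}
```

A caller would seed a `std::mt19937`, build the table once with `buildTable(tokenize(text))`, and then call `generate` with any key from the table as the starting word.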

For example, if you look at the very first sentence of this solution, you can come up with the following frequency table:

    According: to(100%)
    to:        Wikipedia(100%)
    Wikipedia: ,(100%)
    a:         Markov(50%), random(50%)
    Markov:    Chain(100%)
    Chain:     is(100%)
    is:        a(50%), dependent(50%)
    random:    process(100%)
    process:   where(100%)
    ...
    state:     is(50%), .(50%)
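
As a rough sketch of how one row of such a table could be computed, the hypothetical helper below (`followerPercentages` is a name invented for this example) takes an already tokenized text and one key, and returns the percentage for each word that follows the key. For instance, feeding it the fragment "a Markov Chain is a random process" with the key `a` yields an even split between `Markov` and `random`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// For a single key, returns follower -> percentage of the times that
// follower came directly after the key. The result is empty if the key
// never appears with a follower.
std::map<std::string, double> followerPercentages(
        const std::vector<std::string>& tokens, const std::string& key) {
    assert(!key.empty());
    std::map<std::string, int> counts;
    int total = 0;
    for (std::size_t i = 0; i + 1 < tokens.size(); ++i) {
        if (tokens[i] == key) {
            ++counts[tokens[i + 1]];
            ++total;
        }
    }
    std::map<std::string, double> percentages;
    for (const auto& entry : counts)
        percentages[entry.first] = 100.0 * entry.second / total;
    return percentages;
}
```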

Essentially, the transition from one state to another is probability-based. In the case of a text-based Markov Chain, the transition probability is based on how frequently each word follows the selected word. So the selected word represents the previous state, and its row in the frequency table represents the (possible) successive states. You can only find the successive state if you know the previous state (that's the only way to look up the right row of the frequency table), so this fits in with the definition where the successive state is dependent on the previous state.
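
The program in the question appears to apply the same idea with two differences: the state is the last two words (`NPREF = 2`) instead of one, and instead of storing counts it stores every suffix it sees, repeats included, so a uniform random pick from that list is already weighted by frequency. A minimal sketch of just that part, assuming whitespace-separated words and using `<random>` in place of `rand()` (`buildStateTable` and `pickSuffix` are names invented here, not from the book's code):

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <random>
#include <sstream>
#include <string>
#include <vector>

using Prefix = std::deque<std::string>;
using StateTable = std::map<Prefix, std::vector<std::string>>;

// For every run of `order` consecutive words, records each word seen
// immediately after it. Repeated suffixes are stored repeatedly.
StateTable buildStateTable(const std::string& text, std::size_t order) {
    assert(order > 0);
    StateTable table;
    Prefix prefix;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        if (prefix.size() == order) {
            table[prefix].push_back(word);
            prefix.pop_front();
        }
        prefix.push_back(word);
    }
    return table;
}

// Uniform pick from a suffix list. Because duplicates are kept, a suffix
// that followed the prefix twice is twice as likely to be chosen.
const std::string& pickSuffix(const std::vector<std::string>& suffixes,
                              std::mt19937& rng) {
    assert(!suffixes.empty());
    std::uniform_int_distribution<std::size_t> index(0, suffixes.size() - 1);
    return suffixes[index(rng)];
}
```

With the input "the cat sat the cat ran the cat sat", the prefix ("the", "cat") maps to the list sat, ran, sat, so `sat` is picked about two times out of three.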

Shameless Plug - I wrote a program to do just this in Perl, some time ago. You can read about it here.

Vivin Paliath answered Oct 31 '22 00:10