Given the example trigram language model from the notes linked below, for which:
q(runs | the, dog) = 0.5
Should this not be 1? For
q(runs | the, dog):
x_i = runs, x_{i-2} = the, x_{i-1} = dog
and the probability is (w_i has been swapped for x_i):
q(x_i | x_{i-2}, x_{i-1}) = count(x_{i-2} x_{i-1} x_i) / count(x_{i-2} x_{i-1})
therefore:
count(the dog runs) / count(the dog) = 1 / 1 = 1
But in the above example the value is 0.5. How is 0.5 arrived at?
Based on http://files.asimihsan.com/courses/nlp-coursera-2013/notes/nlp.html#markov-processes-part-1
The number 0.5 was not "arrived at" at all; the author just took an arbitrary number for the purpose of illustration.
Any n-gram language model consists of two things: a vocabulary and transition probabilities. The model "does not care" how these probabilities were derived; the only requirement is that they are self-consistent (that is, for any prefix, the probabilities of all possible continuations sum up to 1). For the model above this holds: e.g. p(runs | the, dog) + p(STOP | the, dog) = 1.
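A minimal sketch of that consistency check in Python. Only q(runs | the, dog) = 0.5 comes from your screenshot; the other entries in the table are hypothetical values chosen so that they match the generation probabilities discussed below ("*" marks start-of-sentence padding, STOP the end symbol):

```python
# Hypothetical trigram transition table; only ("the", "dog") -> "runs" = 0.5 is
# taken from the question, the rest are illustrative assumptions.
q = {
    ("*", "*"):      {"the": 1.0},
    ("*", "the"):    {"dog": 0.5, "STOP": 0.5},
    ("the", "dog"):  {"runs": 0.5, "STOP": 0.5},
    ("dog", "runs"): {"STOP": 1.0},
}

# Self-consistency: for every prefix, the continuation probabilities sum to 1.
for prefix, continuations in q.items():
    total = sum(continuations.values())
    assert abs(total - 1.0) < 1e-9, (prefix, total)
```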
Of course, in practical applications we are indeed interested in how to "learn" the model parameters from some text corpus. You can calculate that your particular language model can generate the following texts (a sketch of this calculation follows the list):
the # with 0.5 probability
the dog # with 0.25 probability
the dog runs # with 0.25 probability
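As a sketch (re-using the same assumed transition table as above), each of these probabilities is just the product of the trigram transitions along the sentence, including the final STOP:

```python
def sentence_prob(q, words):
    """Product of q(x_i | x_{i-2}, x_{i-1}) over the padded sentence."""
    padded = ["*", "*"] + words + ["STOP"]
    p = 1.0
    for i in range(2, len(padded)):
        p *= q[(padded[i - 2], padded[i - 1])].get(padded[i], 0.0)
    return p

q = {  # same hypothetical table as in the previous snippet
    ("*", "*"):      {"the": 1.0},
    ("*", "the"):    {"dog": 0.5, "STOP": 0.5},
    ("the", "dog"):  {"runs": 0.5, "STOP": 0.5},
    ("dog", "runs"): {"STOP": 1.0},
}

print(sentence_prob(q, ["the"]))                 # 0.5
print(sentence_prob(q, ["the", "dog"]))          # 0.25
print(sentence_prob(q, ["the", "dog", "runs"]))  # 0.25
```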
From this observation, we can "reverse-engineer" the training corpus: it might have consisted of 4 sentences:
the
the
the dog
the dog runs
If you count all the trigrams in this corpus and normalize the counts, you see that the resulting relative frequencies are equal to the probabilities from your screenshot. In particular, there is 1 sentence which ends after "the dog", and 1 sentence in which "the dog" is followed by "runs". That's how the probability 0.5 (= 1/(1+1)) could have emerged.
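A minimal sketch of that counting and normalization over the hypothetical 4-sentence corpus (sentences padded with "*" and terminated with STOP, as in the linked notes):

```python
from collections import Counter

# Hypothetical training corpus reverse-engineered above.
corpus = [["the"], ["the"], ["the", "dog"], ["the", "dog", "runs"]]

trigrams, bigrams = Counter(), Counter()
for sentence in corpus:
    padded = ["*", "*"] + sentence + ["STOP"]
    for i in range(2, len(padded)):
        trigrams[tuple(padded[i - 2:i + 1])] += 1  # count(x_{i-2} x_{i-1} x_i)
        bigrams[tuple(padded[i - 2:i])] += 1       # count(x_{i-2} x_{i-1})

# Maximum-likelihood estimates: count(the dog w) / count(the dog)
print(trigrams[("the", "dog", "runs")] / bigrams[("the", "dog")])  # 1 / 2 = 0.5
print(trigrams[("the", "dog", "STOP")] / bigrams[("the", "dog")])  # 1 / 2 = 0.5
```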