I'm implementing the forward algorithm for HMMs to calculate the probability of a given HMM emitting a given observation sequence. I'd like my algorithm to be robust to underflow. I can't work in log-space because the forward algorithm requires the multiplication AND addition of probabilities. What is the best way to avoid underflow?
I've read some sources about this but the best suggestion I get is scaling the probabilities at each time step Section 6 Here. By the end of the algorithm you won't be left with the exact probability you want (of the observation sequence). Also, unless I'm mistaken, if you scale probabilities at each time step as proposed in the above reference, you can't do a meaningful comparison of the probability of a given observation sequence having come from two different HMMs (to figure out which one is more likely to have output the sequence). Any suggestions?
In equation 32 at the end of your reference you multiply every probability value alpha_t(i) by C_t. So at the end you have multiplied your final probabilities by the product of all the C_t. You can keep track of all of this by keeping track of the sum of log(C_t). Then at the end you can work out log(alpha_t(i)) - SUM_(j <= t)log(C_j) which will give you the log probability of the final alpha_t(i), or log(SUM_t alpha_t(i)) - SUM_(j <= t)log(C_j) which will give you the log probability of the entire sequence.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With