Negative results using kenlm

I am new to language modeling. I built a 3-gram language model with KenLM from a large text file (~7 GB), converted it to a binary file, and load it in Python like this:

import kenlm
model = kenlm.LanguageModel(<my .klm file>)
model.score(<my sentence>)

I get a negative number as the result. When I change the sentence being scored, the result changes but stays negative. Even when I give it a sentence taken exactly from the large text file, it returns a worse negative number than it does for a sentence that is not in the text file. I don't know what a negative result means, or how I can convert it to a positive, normal-looking result so that I can select the most correct sentence among several candidates.

Emad Helmi asked Oct 25 '25 05:10


2 Answers

The final negative number, say -9.585592, is the log probability of the sentence. Since it is a base-10 logarithm, you recover the probability by raising 10 to the power of that number: 10^(-9.585592) is around 2.60 × 10^(-10). Maybe this is the positive number you are looking for.
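To see why the score is always negative, here is a minimal sketch that does not call KenLM at all. It assumes a made-up list of per-token probabilities (`token_probs` is purely hypothetical) and shows that summing their base-10 logarithms gives a negative total whose antilog is the product of the probabilities. Every probability is at most 1, so every log term is at most 0, and longer sentences tend to score more negatively.

```python
import math

# Hypothetical per-token conditional probabilities for a four-token sentence.
# These values are invented for illustration; they do not come from a real model.
token_probs = [0.1, 0.02, 0.3, 0.05]

# A sentence log probability is the sum of the per-token log10 probabilities.
log10_score = sum(math.log10(p) for p in token_probs)

# Raising 10 to that sum recovers the product of the token probabilities.
sentence_prob = 10 ** log10_score

print(log10_score)    # negative, because every term is log10 of a value <= 1
print(sentence_prob)  # a small positive number between 0 and 1
```

A score closer to zero therefore means a more probable sentence; the sign itself is not an error.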


ankesh pandey answered Oct 26 '25 19:10


To get the corresponding score that is between 0 and 1:

import math
print(math.pow(10,model.score(<my sentence>)))
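For choosing the most likely sentence among several candidates, the conversion is not strictly needed: 10 to the power of x grows with x, so the candidate with the highest log score (the one closest to zero) also has the highest probability. The sketch below assumes a hypothetical dictionary of scores, as `model.score` might return them; the sentences and numbers are invented. The per-word division is an optional heuristic for comparing candidates of different lengths, not something KenLM does for you.

```python
# Hypothetical log10 scores for three candidate sentences.
# In practice each value would come from model.score(sentence).
candidate_scores = {
    "this is a test": -9.59,
    "this is an test": -13.2,
    "test a is this": -17.8,
}

def best_candidate(scores):
    """Return the sentence with the highest (least negative) log10 score."""
    return max(scores, key=scores.get)

def per_word_score(sentence, score):
    """Heuristic: average log10 score per whitespace-separated word."""
    return score / len(sentence.split())

best = best_candidate(candidate_scores)
print(best)

# Length-normalized scores, useful when candidates differ in word count.
for sentence, score in candidate_scores.items():
    print(sentence, per_word_score(sentence, score))
```

Comparing log scores directly also avoids the very small floating-point values that appear once long sentences are converted back to probabilities.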
Wei JIANG answered Oct 26 '25 18:10


