
Python interface to ARPA files

I'm looking for a Pythonic interface to load ARPA files (back-off language models) and use them to evaluate some text, e.g. to get its log-probability, perplexity, etc.

I don't need to generate the ARPA file in Python, only to use it for querying.

Does anybody have a recommended package? I have already looked at kenlm and swig-srilm, but the first is very hard to set up on Windows and the second no longer seems to be maintained.

Beka asked May 26 '14 04:05


2 Answers

I found a nice package, still under development, called pynlpl. It does exactly what I need with very few dependencies (libxml2 is about all it takes), and it provides a pure-Python implementation for working with ARPA files.
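For anyone curious what querying an ARPA file involves, here is a minimal pure-Python sketch of the standard back-off lookup. It is not pynlpl's code or API: the names `parse_arpa`, `log_p`, `log_s`, `perplexity` and the toy model `TOY_ARPA` are made up for illustration. It assumes log10 values as in the ARPA format, `<s>`/`</s>` sentence markers, and no special handling of unknown words.

```python
# Toy bigram model in ARPA format (hypothetical numbers, illustration only).
TOY_ARPA = r"""
\data\
ngram 1=4
ngram 2=3

\1-grams:
-1.0 <s> -0.5
-0.7 a -0.3
-0.7 b -0.3
-1.0 </s>

\2-grams:
-0.3 <s> a
-0.4 a b
-0.5 b </s>

\end\
"""


def parse_arpa(text):
    """Return (probs, backoffs): dicts keyed by n-gram tuples, values in log10."""
    probs, backoffs = {}, {}
    order = 0  # 0 means "not inside an n-gram section"
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("\\"):
            # Section markers: \data\, \1-grams:, \2-grams:, ..., \end\
            if line.endswith("-grams:"):
                order = int(line[1:line.index("-")])
            else:
                order = 0
            continue
        if order == 0:
            continue  # skip the "ngram N=count" header lines
        fields = line.split()
        ngram = tuple(fields[1:1 + order])
        probs[ngram] = float(fields[0])
        if len(fields) > 1 + order:  # optional trailing back-off weight
            backoffs[ngram] = float(fields[1 + order])
    return probs, backoffs


def log_p(probs, backoffs, ngram):
    """log10 P(last word | preceding words), backing off to shorter contexts."""
    ngram = tuple(ngram)
    if ngram in probs:
        return probs[ngram]
    if len(ngram) == 1:
        raise KeyError(f"out-of-vocabulary word: {ngram[0]!r}")
    # Back-off weight of the context (0.0 if absent) plus the shorter n-gram.
    return backoffs.get(ngram[:-1], 0.0) + log_p(probs, backoffs, ngram[1:])


def log_s(probs, backoffs, sentence):
    """Return (total log10 probability, number of scored tokens)."""
    order = max(len(ngram) for ngram in probs)
    words = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for i in range(1, len(words)):  # <s> itself is never predicted
        start = max(0, i - order + 1)
        total += log_p(probs, backoffs, words[start:i + 1])
    return total, len(words) - 1


def perplexity(probs, backoffs, sentence):
    """Perplexity per scored token (words plus the end-of-sentence marker)."""
    total, count = log_s(probs, backoffs, sentence)
    return 10 ** (-total / count)


probs, backoffs = parse_arpa(TOY_ARPA)
print(log_s(probs, backoffs, "a b"))       # every bigram is listed in the model
print(log_s(probs, backoffs, "b a"))       # every bigram backs off to unigrams
print(perplexity(probs, backoffs, "a b"))
```

A real package also has to deal with large files, `<unk>`, and different conventions for sentence markers, so treat this only as a picture of the lookup logic.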

Beka answered Oct 17 '22 16:10


What about the arpa package?

It's rather lightweight, and its API is intuitive and easy to learn. It's not as fast as kenlm, but you may still want to give it a try.

https://pypi.org/project/arpa/
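A short usage sketch, based on my reading of the package's README (check it against the version you install). The wrapper `score_with_arpa` and the file name `foo.arpa` are placeholders of mine, not part of the package. The import sits inside the function so that the snippet loads even where arpa is not installed.

```python
def score_with_arpa(path, sentence):
    """Return the sentence log-probability from the first model in an ARPA file."""
    import arpa  # third-party package: pip install arpa

    models = arpa.loadf(path)  # an ARPA file may contain several models
    lm = models[0]
    # log_s scores the whole sentence; lm.log_p("in the end") would instead
    # give the log-probability of the last word given the words before it.
    return lm.log_s(sentence)


# Example call (needs a real ARPA file on disk):
# print(score_with_arpa("foo.arpa", "This is the end ."))
```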

Magz answered Oct 17 '22 17:10