I need to extract messages from .po files. Is there a Python module to do that? I wrote a parser, but it depends on the platform (\r\n vs. \n).
Is there a better way to do this?
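In most cases you don't need to parse .po files yourself. Developers give translators a .pot template file; the translators rename it to xx_XX.po and translate the strings. Then you, as the developer, only have to "compile" them to .mo files using GNU gettext's msgfmt tool (or msgfmt.py, the pure-Python equivalent shipped in Python's Tools/i18n directory; pygettext, by contrast, extracts strings into a .pot template rather than compiling them).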
But if you want or need to parse the .po files yourself instead of compiling them, I strongly suggest using polib, a well-known Python library for handling .po files. It is used by several large-scale projects, such as Mercurial and Ubuntu's Launchpad translation engine:
PyPI package home: http://pypi.python.org/pypi/polib/
Code repository: https://github.com/izimobil/polib
(Original repository was hosted at Bitbucket, which no longer supports Mercurial: https://bitbucket.org/izi/polib/wiki/Home)
Documentation: http://polib.readthedocs.org
The importable module is a single file under the MIT license, so you can easily incorporate it in your code like this:
import polib

# Load the catalog; each entry exposes the source and translated strings.
po = polib.pofile('path/to/catalog.po')
for entry in po:
    print(entry.msgid, entry.msgstr)
It can't be easier than that ;)
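If adding a dependency is not an option, the platform dependence mentioned in the question usually comes from splitting on a hard-coded line terminator. The following is a minimal sketch using only the standard library, not a replacement for polib: `parse_po` and `_store` are hypothetical names, and the sketch assumes plain msgid/msgstr pairs with Python-compatible escapes. It ignores msgctxt, plural forms, and obsolete entries.

```python
import ast


def _store(entries, fields):
    # Keep an entry only once both halves have been seen.
    if "msgid" in fields and "msgstr" in fields:
        entries[fields["msgid"]] = fields["msgstr"]


def parse_po(text):
    """Return {msgid: msgstr} from .po source text (simplified sketch)."""
    entries = {}
    fields = {}      # fields of the entry currently being read
    current = None   # field that a continuation line ("...") extends

    # Normalize \r\n and bare \r to \n first, so the result does not
    # depend on the platform that wrote the file.
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments (including "#~" obsolete entries)
        if line.startswith('"'):
            if current is not None:
                # Assumption: the quoted chunk is also a valid Python string literal.
                fields[current] += ast.literal_eval(line)
            continue
        keyword, _, rest = line.partition(" ")
        if keyword == "msgid":
            _store(entries, fields)  # a new msgid closes the previous entry
            fields = {"msgid": ast.literal_eval(rest.strip())}
            current = "msgid"
        elif keyword == "msgstr":
            fields["msgstr"] = ast.literal_eval(rest.strip())
            current = "msgstr"
        else:
            current = None  # msgctxt, msgid_plural, msgstr[n]: not handled here
    _store(entries, fields)
    return entries


SAMPLE = (
    'msgid "Hello"\n'
    'msgstr "Bonjour"\n'
    '\n'
    'msgid "Two "\n'
    '"lines"\n'
    'msgstr "Deux "\n'
    '"lignes"\n'
)

unix = parse_po(SAMPLE)
windows = parse_po(SAMPLE.replace("\n", "\r\n"))
assert unix == windows  # same result for \n and \r\n input
```

When reading from disk, opening the file in Python 3 text mode with an explicit encoding (for example `open(path, encoding="utf-8")`) already translates line endings, so the manual normalization mainly matters for text obtained some other way.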
Babel includes a .po file parser written in Python:
http://babel.edgewall.org/
The built-in gettext module works only with binary .mo files.