Parsing gettext `.po` files with Python

Tags: python, gettext

I need to extract messages from .po files. Is there a Python module to do that? I wrote a parser, but it depends on the platform's line endings (\r\n vs. \n).

Is there a better way to do this?

alex asked Mar 06 '12 08:03


2 Answers

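In most cases you don't need to parse .po files yourself. Developers give translators a .pot template file, which the translators rename to xx_XX.po and translate. Then you, as the developer, only have to "compile" the result to .mo files using GNU gettext's msgfmt tool (or msgfmt.py, its Python counterpart in CPython's Tools/i18n directory; pygettext, which lives next to it, extracts messages into the .pot template rather than compiling them).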

But if you want or need to parse the .po files yourself instead of compiling them, I strongly suggest using polib, a well-known Python library for handling .po files. It is used by several large-scale projects, such as Mercurial and Ubuntu's Launchpad translation engine:

PyPI package home: http://pypi.python.org/pypi/polib/

Code repository: https://github.com/izimobil/polib

(The original repository was hosted on Bitbucket, which no longer supports Mercurial: https://bitbucket.org/izi/polib/wiki/Home)

Documentation: http://polib.readthedocs.org

The importable module is a single file under the MIT license, so you can easily bundle it with your code and use it like this:

import polib

# Load the catalog; each entry exposes its msgid and msgstr
po = polib.pofile('path/to/catalog.po')
for entry in po:
    print(entry.msgid, entry.msgstr)

It can't be easier than that ;)
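If a hand-written parser is still preferred, the line-ending problem from the question can usually be avoided by splitting the text with str.splitlines(), which treats \n and \r\n alike. The following is only a rough sketch under stated assumptions: parse_simple_po is a hypothetical name, it handles plain msgid/msgstr pairs with continuation lines, it skips comments, msgctxt and plural forms, and it leans on ast.literal_eval to decode the quoted strings, which covers the common escapes but is not a full implementation of the PO format.

```python
import ast

def parse_simple_po(text):
    """Hypothetical helper: return {msgid: msgstr} for plain entries only."""
    catalog = {}
    entry = {}
    field = None  # field that continuation lines are appended to

    def flush():
        # Store the entry only if both parts were seen
        if "msgid" in entry and "msgstr" in entry:
            catalog[entry["msgid"]] = entry["msgstr"]
        entry.clear()

    for raw in text.splitlines():  # splits on \n and \r\n alike
        line = raw.strip()
        if line.startswith("msgid "):
            flush()  # a new msgid closes the previous entry
            field = "msgid"
            entry[field] = ast.literal_eval(line[len("msgid "):])
        elif line.startswith("msgstr "):
            field = "msgstr"
            entry[field] = ast.literal_eval(line[len("msgstr "):])
        elif line.startswith('"') and field:
            entry[field] += ast.literal_eval(line)  # continuation line
        else:
            field = None  # blank line, comment, or unsupported keyword
    flush()
    return catalog

sample = 'msgid "Hello"\r\nmsgstr "Bonjour"\r\n\r\nmsgid "Bye"\nmsgstr ""\n"Au "\n"revoir"\n'
print(parse_simple_po(sample))
```

For anything beyond simple catalogs (plurals, contexts, obsolete entries, fuzzy flags), polib remains the safer choice.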

MestreLion answered Sep 22 '22 17:09


Babel includes a .po file parser written in Python:

http://babel.edgewall.org/

The built-in gettext module works only with binary .mo files.
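To illustrate that last point, here is a minimal sketch of what the standard library does accept: gettext.GNUTranslations reads a binary .mo stream, not .po text. Because no compiled catalog is at hand, the example packs a tiny one in memory. build_mo is a hypothetical helper written for this illustration (it is not part of gettext), it assumes ASCII-only messages, and it leaves out the metadata entry and hash table that real tools such as msgfmt write.

```python
import gettext
import io
import struct

def build_mo(messages):
    """Hypothetical helper: pack {msgid: msgstr} (ASCII only) into minimal .mo bytes."""
    keys = sorted(messages)  # GNU tools store msgids in sorted order
    ids = b""
    strs = b""
    spans = []  # (msgid offset, msgid length, msgstr offset, msgstr length)
    for key in keys:
        k = key.encode("ascii")
        v = messages[key].encode("ascii")
        spans.append((len(ids), len(k), len(strs), len(v)))
        ids += k + b"\0"   # strings are NUL-terminated
        strs += v + b"\0"
    n = len(keys)
    id_table = 7 * 4              # tables start right after the 7-word header
    str_table = id_table + 8 * n
    ids_start = str_table + 8 * n
    strs_start = ids_start + len(ids)
    table = []
    for id_off, id_len, _, _ in spans:
        table += [id_len, ids_start + id_off]
    for _, _, str_off, str_len in spans:
        table += [str_len, strs_start + str_off]
    # magic, version, count, msgid table offset, msgstr table offset, hash size, hash offset
    header = struct.pack("<7I", 0x950412DE, 0, n, id_table, str_table, 0, 0)
    return header + struct.pack("<%dI" % len(table), *table) + ids + strs

mo_bytes = build_mo({"Hello": "Bonjour", "Goodbye": "Au revoir"})
translations = gettext.GNUTranslations(io.BytesIO(mo_bytes))
print(translations.gettext("Hello"))  # Bonjour
```

In practice the .mo file would come from msgfmt (or from a library such as polib or Babel), and the file object passed to GNUTranslations would be the compiled catalog opened in binary mode.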

yak answered Sep 23 '22 17:09