How to efficiently work with gettext PO files when making small edits to large text values

I'm looking for tips or tools for working efficiently with gettext PO files when making small edits to large msgid values.

Example: We have lots of multi-sentence/multi-paragraph messages that are stored in our PO message catalog files. If we make a very minor change to a message, perhaps editing a single sentence or even correcting punctuation, we lose our original translation when we run the msgmerge utility.

Rather than re-translate long messages (which have already gone through an editorial approval process) from scratch, our translators go back to backup copies of their PO files and manually search for the last msgid/msgstr translation pair. They diff the old msgid against the current msgid to see what has changed, then copy and paste the last translation and edit it to reflect the updated msgid value.
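The manual search-and-diff step described above could be sketched roughly as follows. This is only an illustration using Python's standard difflib module: it assumes the old msgid/msgstr pairs have already been loaded from a backup PO file into a plain dict, and the function names, the sample strings, and the 0.6 similarity cutoff are all hypothetical choices, not part of gettext.

```python
import difflib


def find_previous_translation(new_msgid, old_catalog, cutoff=0.6):
    """Return (old_msgid, old_msgstr) for the closest old msgid, or None.

    old_catalog is assumed to be a dict of msgid -> msgstr taken from a
    backup PO file; loading that file is outside this sketch.
    """
    matches = difflib.get_close_matches(
        new_msgid, list(old_catalog), n=1, cutoff=cutoff
    )
    if not matches:
        return None
    old_msgid = matches[0]
    return old_msgid, old_catalog[old_msgid]


def word_diff(old_msgid, new_msgid):
    """List the words removed ("- ") and added ("+ ") between two msgids."""
    diff = difflib.ndiff(old_msgid.split(), new_msgid.split())
    # Drop unchanged words and difflib's "? " hint lines.
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    # Hypothetical backup data; the msgstr values are placeholders.
    backup = {
        "Please save you work before closing. Unsaved changes are lost.":
            "<approved translation of the long message>",
        "Welcome": "<approved translation of Welcome>",
    }
    current = "Please save your work before closing. Unsaved changes are lost."
    found = find_previous_translation(current, backup)
    if found:
        old_msgid, old_msgstr = found
        print("Previous translation:", old_msgstr)
        print("Changed words:", word_diff(old_msgid, current))
```

A translator would still review and edit the recovered translation by hand; the sketch only locates the closest earlier msgid and shows which words changed.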

That's a lot of work! Certainly there must be a better way of handling this type of workflow?

Is there a best-practice way to archive and find previous translations that are no longer in a PO file? One idea that comes to mind is to store a unique message ID in the text of our messages, or in the comments that precede each message, and use this ID to retrieve previous msgid/msgstr translation pairs for review. Or are there PO editors or online services that make this process more efficient?

Thank you, Malcolm

asked Jun 03 '10 12:06 by Malcolm



1 Answer

I've been looking for a way to make minor changes to msgids without disturbing existing translations - for instance, typo fixes in the source text. Here's a recipe I've just worked out that doesn't involve websites:

  1. Use msgen from GNU gettext to generate an English-to-English po file:

    msgen project.pot >corrections.po

  2. Manually edit the msgstrs in "corrections.po" to reflect the typo fixes made in the source text, so we have a mapping from uncorrected to corrected strings. (I haven't thought about how to automate this bit.)

  3. For each "real" translation (for example ca.po): abuse poswap from the Translate Toolkit (translate-toolkit in Ubuntu) to change the msgids:

    poswap -i corrections.po -t ca.po -o ca.new.po

This does seem to lose header comments and obsolete strings from GNU gettext po files, but manually fixing those up is much less work than manually tweaking msgids in each translation (and could probably easily be scripted).

(Obviously, this should only be used in exceptional circumstances, where you're absolutely sure that none of the translators need the opportunity to re-review their translations.)
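Step 2 of the recipe might be partly automated along the following lines. This is a minimal sketch, not a tested tool: it assumes corrections.po came straight from msgen (so each msgstr starts out identical to its msgid), that each typo fits within a single line of the file, and that the hypothetical apply_corrections helper and its fixes dict are acceptable stand-ins for a real PO library. It rewrites only msgstr lines and leaves msgids untouched, which is what poswap needs for the mapping.

```python
def apply_corrections(po_text, fixes):
    """Apply {wrong: right} replacements to msgstr lines only.

    msgid lines (and their continuation lines) are left as-is, so the
    result maps each uncorrected msgid to a corrected msgstr.
    Assumption: a typo never spans two wrapped lines of the PO file.
    """
    out = []
    in_msgstr = False
    for line in po_text.splitlines(keepends=True):
        stripped = line.lstrip()
        if stripped.startswith("msgstr"):
            # Covers both msgstr and plural forms such as msgstr[0].
            in_msgstr = True
        elif not stripped.startswith('"'):
            # Any non-continuation line (msgid, comment, blank) ends the msgstr.
            in_msgstr = False
        if in_msgstr:
            for wrong, right in fixes.items():
                line = line.replace(wrong, right)
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    # The file name follows the recipe above; the fix list is hypothetical.
    with open("corrections.po", encoding="utf-8") as handle:
        original = handle.read()
    corrected = apply_corrections(original, {"Teh": "The"})
    with open("corrections.po", "w", encoding="utf-8") as handle:
        handle.write(corrected)
```

Because the header entry is also a msgstr, the replacements apply to it too, so the result is worth checking by eye before running poswap in step 3.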

answered Oct 26 '22 02:10 by jtn