
l10n/i18n: how to handle phrases with dynamic list of items?

What's the sanest way to handle translation and localization of dynamic lists?

Let's say I've queried the database and got a list ["Foos", "Bars", "Bazes"]. Let's also assume the list always contains at least two items - I'll be sure to use a different translation for the single-item case.

What should I do if I need a phrase like "We have a wide choice of Foos, Bars and Bazes in our code"? (assuming that list items are dynamic so I can't just pre-translate all the possible permutations, and need to do things at runtime.)

I see at least the following issues:

  • I need to inflect all the items to the correct form (are there languages where different forms are required depending on the position in the list?)

  • Different locales may have drastically different rules for joining items.

    • E.g. CJK locales need "、" instead of ",".
    • And AFAIK in Chinese there will be "及" or "和" - depending on the full phrase - before the last item, so I guess there's some ambiguity with translating "and".
    • And, as I've read, some languages may avoid punctuation the way it's used in English and rely on other conventions instead, e.g. an Arabic translator may prefer to use "و" before every item (although Arabic also has a comma, "،"). Not sure if this is true - I don't know Arabic, I just saw it mentioned.

My problem is, I don't even know what tooling may help me here. I don't have any particular programming language requirements, although Python or JavaScript would be best. But I guess I can run just about anything, as I can probably build an l10n microservice and query it from my project.

I had used GNU gettext before I ran into this, but I haven't found anything in its APIs and data formats that would help. The best I can imagine is _("We have a wide choice of %s in our code", list_text), generating list_text with some DIY hacks. I'm not sure the XLIFF format has anything like this either. I've found i18n-list-generator on npm, but it's way too simplistic.

Has anyone dealt with something like this? What did you do? Is there any library out there that handles this - so I can take a look at its API and learn how it does things?

asked Aug 21 '17 at 16:08 by drdaeman


2 Answers

Here's how I would approach it:

  1. No concatenation. All string joining needs to be done via format strings with placeholders.

  2. Only use format strings that support named/numbered placeholders. E.g. {FOO} or $1 instead of %s (this is to allow for parameter reordering). Named placeholders are also better since they give more context to translators. Let's assume we're using {FOO}-style placeholders.

  3. To render a list, I would use a couple of format strings, e.g.: joinItem = "{LIST}, {ITEM}" to append items to the list and joinLastItem = "{LIST} and {ITEM}" to append the last item. This allows one to render strings like Foos, Bars and Bazes, change the punctuation, and even reverse the ordering of the list if necessary.

  4. Finally, you can use the final format string, e.g. weHaveTheseItems = "We have a wide choice of {ITEMS} in our code", assuming the {ITEMS} gets replaced with the previously rendered string.
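
The steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not how Plurr or any other library does it: the MESSAGES catalog, the render_list and we_have helpers, and the "x-demo" pseudo-locale are all made up here, and Python's built-in str.format stands in for a real placeholder library.

```python
# Hypothetical message catalog. The keys mirror the format-string names used
# in the steps above; "x-demo" is a made-up pseudo-locale (not a real
# translation) that only demonstrates reordering and different punctuation.
MESSAGES = {
    "en": {
        "joinItem": "{LIST}, {ITEM}",
        "joinLastItem": "{LIST} and {ITEM}",
        "weHaveTheseItems": "We have a wide choice of {ITEMS} in our code",
    },
    "x-demo": {
        "joinItem": "{ITEM}; {LIST}",
        "joinLastItem": "{ITEM}; {LIST}",
        "weHaveTheseItems": "In our code: {ITEMS}",
    },
}


def render_list(items, messages):
    """Fold the items left to right using the two join format strings."""
    if not items:
        raise ValueError("expected at least one item")
    rendered = items[0]
    # Every item except the first and the last is appended with joinItem.
    for item in items[1:-1]:
        rendered = messages["joinItem"].format(LIST=rendered, ITEM=item)
    # The last item gets its own format string ("and", or whatever the
    # translator chose for that locale).
    if len(items) > 1:
        rendered = messages["joinLastItem"].format(LIST=rendered, ITEM=items[-1])
    return rendered


def we_have(items, locale="en"):
    """Render the full phrase for the given locale."""
    messages = MESSAGES[locale]
    return messages["weHaveTheseItems"].format(ITEMS=render_list(items, messages))


print(we_have(["Foos", "Bars", "Bazes"]))
print(we_have(["Foos", "Bars", "Bazes"], locale="x-demo"))
```

Because the translator owns all three strings, nothing in the code itself assumes a comma, the word "and", or even the left-to-right order of the items.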

Shameless self-promotion: you may want to have a look at the Plurr library that supports such {FOO}-style placeholders, as well as plurals (something you will likely need for such messages). It supports JavaScript among other languages.

answered Nov 10 '22 at 16:11 by Igor Afanasyev


This is a pain: as you point out, not all locales can be expected to support the ",,,,and" form.

Inspired by @GSerg and @Igor Afanasyev, I came up with a GNU gettext-based solution along these lines (pseudo gettext invocation):

GettextPlural(
    // TRANSLATORS: For multiple "choices", each will be prefixed with a new-line (\n)
    "We have a wide choice of {choices} in our code",
    "In our code we have a wide choice of{choices}", choices.Count)

should print like:

"We have a wide choice of FOOs in our code"

"In our code we have a wide choice of
FOOs
BARs
BAZs"

Remember to add --add-comments=TRANSLATORS to your xgettext invocation.

For Web purposes you could use <ul><li>...</li>... </ul> or whatever instead of \n.

The benefit is that a list layout is at least as universal as the UI layout itself, while translators are still free to use plural forms that differ from the English ones.

Some languages have only one plural form, so their single translation must work for both one choice and several; in particular, it cannot contain a conditional newline.
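
For reference, here is a rough Python sketch of the pseudo invocation above, using only the standard library's gettext module. The describe_choices helper is a hypothetical name, and gettext.NullTranslations is used so the example runs without a compiled catalog: it returns the English singular for a count of 1 and the English plural otherwise. A real project would load a catalog with gettext.translation(...) instead.

```python
import gettext

# Fallback translations object from the standard library. With no catalog
# loaded, ngettext() returns the singular msgid for n == 1 and the plural
# msgid for any other n.
translations = gettext.NullTranslations()


def describe_choices(choices):
    """Render the phrase; multiple choices are each prefixed with a newline."""
    # TRANSLATORS: For multiple "choices", each will be prefixed with a new-line (\n)
    template = translations.ngettext(
        "We have a wide choice of {choices} in our code",
        "In our code we have a wide choice of{choices}",
        len(choices),
    )
    if len(choices) == 1:
        rendered = choices[0]
    else:
        # The newline lives in the data, not in the translated string, so the
        # translation never needs a conditional newline of its own.
        rendered = "".join("\n" + choice for choice in choices)
    return template.format(choices=rendered)


print(describe_choices(["FOOs"]))
print(describe_choices(["FOOs", "BARs", "BAZs"]))
```

For HTML output, the same helper could build the <ul><li>...</li></ul> markup in place of the newline-prefixed items.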

answered Nov 10 '22 at 16:11 by Robert Jørgensgaard Engdahl