
Best way to design for localization of strings


This is kinda a general question, open for opinions. I've been trying to come up with a good way to design for localization of string resources for a Windows MFC application and related utilities. My wishlist is:

  • Must preserve string literals in code (as opposed to replacing with macro #define resource ID's), so that the messages are still readable inline
  • Must allow localized string resources (duh)
  • Must not impose additional run-time environment restrictions (eg: dependency on .NET, etc.)
  • Should have minimal obtrusion into existing code (the less modification the better)
  • Should be debuggable
  • Should generate resource files which are editable by common tools (ie: common format)
  • Should not use copy/paste comment blocks to preserve literal strings in code, or anything else which creates the potential for de-synchronization
  • Would be nice to allow static (compile-time) checking that every "notated" string is in the resource file(s)
  • Would be nice to allow cross-language resource string pooling (for components in various languages, eg: native C++ and .NET)

I have a way which fulfills all my wishlist to some extent except for static checking, but I have had to develop a bit of custom code to achieve it (and it has limitations). I'm wondering if anyone has solved this problem in a particularly good way.

Edit: The solution I currently have looks like this:

ShowMessage( RESTRING( _T("Some string") ) );
ShowMessage( RESTRING( _T("Some string with variable %1"), sNonTranslatedStringVariable ) );

I then have a custom utility that parses the strings out of the 'RESTRING' blocks and puts them into a .resx file for localization, and a separate C# COM object that loads them from the localized resource files with fallback. If the C# object is not available (or cannot load), I fall back to the string in the code. The macro expands to a template class which calls the COM object, does the formatting, etc.

Anyway, I thought it would be useful to add what I have now for reference.

Nick asked Oct 08 '08 23:10


2 Answers

We use the English string as the ID.

If the lookup in the international resource object (loaded from the installed I18N dll) fails, we default to the ID string.

Code looks like:

doAction(I18N.get("Press OK to continue")); 
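The lookup-with-fallback idea can be sketched in C++ roughly as follows. This is only an illustration, not the answerer's actual I18N object: the `Catalog` class and its `add` method are hypothetical names, and the translations are held in an in-memory map rather than loaded from a dll.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical catalog: maps the English source string (used as the ID)
// to its translation for the currently loaded locale.
class Catalog {
public:
    void add(const std::string& english, const std::string& translated) {
        table_[english] = translated;
    }

    // Returns the translation if one exists; otherwise returns the
    // English ID itself, so an untranslated message is still readable.
    std::string get(const std::string& english) const {
        auto it = table_.find(english);
        return it != table_.end() ? it->second : english;
    }

private:
    std::unordered_map<std::string, std::string> table_;
};
```

With this shape, a call such as `catalog.get("Press OK to continue")` behaves the same whether or not a translation has been loaded, which is what keeps the literal readable in the code.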

As part of the build process we have a Perl script that parses all source files for string constants. It builds a temp file of all strings in the application and then compares these against the resource strings in each locale to see whether they exist. Any missing string generates an e-mail to the appropriate translation team.
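The answer uses a Perl script for this check; the same idea can be sketched in C++ as below. It is a simplified illustration under stated assumptions: it only recognises literals passed directly to `I18N.get("...")`, and the function names `extractIds` and `missingIds` are made up for the example. Loading the per-locale string sets and sending the e-mail are left out.

```cpp
#include <regex>
#include <set>
#include <string>
#include <vector>

// Collect every string literal passed to I18N.get("...") in a chunk of
// source text. Escaped characters inside the literal are kept verbatim.
std::set<std::string> extractIds(const std::string& source) {
    static const std::regex call(
        R"re(I18N\.get\(\s*"((?:[^"\\]|\\.)*)"\s*\))re");
    std::set<std::string> ids;
    for (std::sregex_iterator it(source.begin(), source.end(), call), end;
         it != end; ++it) {
        ids.insert((*it)[1].str());
    }
    return ids;
}

// Return the IDs that have no entry in one locale's set of translated IDs.
std::vector<std::string> missingIds(const std::set<std::string>& ids,
                                    const std::set<std::string>& translated) {
    std::vector<std::string> missing;
    for (const auto& id : ids) {
        if (translated.count(id) == 0) {
            missing.push_back(id);
        }
    }
    return missing;
}
```

A build step could run `extractIds` over each source file, merge the results, and call `missingIds` once per locale to produce the list that goes to the translation team.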

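We can have multiple dlls, one per locale. The name of each dll is based on the locale name, written in the POSIX-style form below (RFC 3066 language tags, by contrast, are hyphenated, e.g. en-GB):
language[_territory][.codeset][@modifier]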

We try to extract the locale from the machine and be as specific as possible when loading the I18N dll, but fall back to less specific locale variations if the more specific version is not present.

Example:

In the UK, suppose the locale is en_GB.UTF-8.
(I use the term dll loosely, not in the specific Windows sense.)

  • First look for the I18N.en_GB.UTF-8 dll.
  • If this dll does not exist, fall back to I18N.en_GB.
  • If this dll does not exist, fall back to I18N.en.
  • If this dll does not exist, fall back to I18N.default.

The only exception to this rule is: Simplified Chinese (zh_CN) where the fallback is US English (en_US). If the machine does not support simplified Chinese then it is unlikely to support full Chinese.
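The fallback order described above can be sketched as a function that turns a locale name into the list of dll names to try. This is an assumption-laden illustration: `fallbackChain` is a hypothetical name, the locale is assumed to be in the language[_territory][.codeset][@modifier] form, and the handling of zh_CN (skip the bare "zh" step, try en_US, then the default) is one possible reading of the exception, not something the answer spells out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Build the ordered list of resource names to try for a locale such as
// "en_GB.UTF-8", from most specific to least specific.
std::vector<std::string> fallbackChain(const std::string& locale) {
    std::vector<std::string> chain;
    std::string current = locale;
    // Assumption: any locale starting with "zh_CN" gets the special rule.
    const bool simplifiedChinese = locale.compare(0, 5, "zh_CN") == 0;

    while (!current.empty()) {
        chain.push_back("I18N." + current);
        // Drop the last component: @modifier, then .codeset, then _territory.
        const std::size_t cut = current.find_last_of("@._");
        if (cut == std::string::npos) {
            break;
        }
        current.erase(cut);
        if (simplifiedChinese && current == "zh") {
            // Assumed reading of the exception: go to US English instead.
            chain.push_back("I18N.en_US");
            break;
        }
    }
    chain.push_back("I18N.default");
    return chain;
}
```

The loader would then walk the returned list and stop at the first dll that actually exists.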

Martin York answered Sep 22 '22 21:09



The simple way is to only use string IDs in your code - no literal strings. You can then produce a different version of the .rc file for each language and either create resource-only DLLs or simply make different language builds.

There are a couple of shareware utilities that help with localising the .rc file; they handle resizing dialog elements for languages with longer words and warn about missing translations.

A more complicated problem is word order: you may have several values in a printf which must appear in a different order to suit different languages' grammar. There are some extended printf classes on codeproject that let you specify things like printf("word %1s and %2s",var1,var2) so you can switch %1s and %2s if necessary.
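To show why numbered placeholders solve the word-order problem, here is a minimal sketch of positional substitution. It is not one of the codeproject classes: `formatPositional` is a hypothetical helper, it uses the bare `%1`..`%9` form (as in the question's RESTRING example) rather than `%1s`, and it only accepts arguments that are already strings.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Replace %1..%9 in `format` with the matching entry of `args` (1-based).
// "%%" yields a literal percent sign; a placeholder with no matching
// argument is copied through unchanged.
std::string formatPositional(const std::string& format,
                             const std::vector<std::string>& args) {
    std::string out;
    for (std::size_t i = 0; i < format.size(); ++i) {
        const char c = format[i];
        if (c == '%' && i + 1 < format.size()) {
            const char next = format[i + 1];
            if (next == '%') {
                out += '%';
                ++i;
                continue;
            }
            if (next >= '1' && next <= '9') {
                const std::size_t index = static_cast<std::size_t>(next - '1');
                if (index < args.size()) {
                    out += args[index];
                    ++i;
                    continue;
                }
            }
        }
        out += c;
    }
    return out;
}
```

Because each placeholder names its argument, a translator can reorder `%1` and `%2` in the translated string without any change to the calling code.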

Martin Beckett answered Sep 24 '22 21:09
