Best way to design for localization of strings

This is kinda a general question, open for opinions. I've been trying to come up with a good way to design for localization of string resources for a Windows MFC application and related utilities. My wishlist is:

Must preserve string literals in code (as opposed to replacing them with macro #define resource IDs), so that the messages are still readable inline
Must allow localized string resources (duh)
Must not impose additional run-time environment restrictions (e.g. a dependency on .NET)
Should have minimal obtrusion into existing code (the less modification the better)
Should be debuggable
Should generate resource files which are editable by common tools (i.e. a common format)
Should not use copy/paste comment blocks to preserve literal strings in code, or anything else which creates the potential for de-synchronization
Would be nice to allow static (compile-time) checking that every "notated" string is in the resource file(s)
Would be nice to allow cross-language resource string pooling (for components in various languages, e.g. native C++ and .NET)

I have a way which fulfills all my wishlist to some extent except for static checking, but I have had to develop a bit of custom code to achieve it (and it has limitations). I'm wondering if anyone has solved this problem in a particularly good way.

Edit: The solution I currently have looks like this:

ShowMessage( RESTRING( _T("Some string") ) );
ShowMessage( RESTRING( _T("Some string with variable %1"), sNonTranslatedStringVariable ) );

I then have a custom utility that parses the strings out of the 'RESTRING' blocks and puts them into a .resx file for localization, and a separate C# COM object that loads them from localized resource files with fallback. If the C# object is not available (or cannot load), I fall back to the string in the code. The macro expands to a template class which calls the COM object and does the formatting, etc.

Anyway, I thought it would be useful to add what I have now for reference.
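
For readers who want something concrete, here is a minimal sketch of the lookup-with-fallback part of such a macro. It is not the asker's implementation: the provider hook, the restring function, and the single-argument RESTRING are hypothetical names made up for illustration, a plain std::function stands in for the C# COM object, and the %1 formatting step is left out. It also uses L"..." literals directly, which is what _T("...") expands to in a Unicode build.

```cpp
#include <functional>
#include <string>

// Hypothetical provider hook. It writes the translation into `out` and
// returns true, or returns false when it has no translation. In the
// question this role is played by a C# COM object.
using TranslationProvider =
    std::function<bool(const std::wstring& source, std::wstring& out)>;

// Empty by default, which models "the provider is not available".
inline TranslationProvider& provider() {
    static TranslationProvider instance;
    return instance;
}

// Ask the provider for a translation; on any failure return the literal
// that was written in the code.
inline std::wstring restring(const wchar_t* literal) {
    const std::wstring source(literal);
    std::wstring translated;
    if (provider() && provider()(source, translated)) {
        return translated;
    }
    return source;
}

// The macro keeps the readable literal at the call site and gives an
// extraction tool a fixed token to search for.
#define RESTRING(literal) restring(literal)
```

With no provider installed, RESTRING(L"Some string") simply returns "Some string", so the application keeps working when the localization component is missing.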

asked Oct 08 '08 23:10

Nick

2 Answers

We use the English string as the ID.

If the lookup in the internationalization resource object (loaded from the installed I18N dll) fails, we default to the ID string.

Code looks like:

doAction(I18N.get("Press OK to continue"));

As part of the build process we have a Perl script that parses all source for string constants. It builds a temp file of all strings in the application and then compares these against the resource strings in each locale to see if they exist. Any missing strings generate an e-mail to the appropriate translation team.
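
The build-time check described above can be sketched roughly as follows. This is C++ rather than the Perl script the answer mentions, and it rests on assumptions: only literals passed directly to I18N.get("...") are collected (the real script scans all string constants), the function names extractStrings and missingStrings are made up, and the e-mail step is left out.

```cpp
#include <algorithm>
#include <iterator>
#include <regex>
#include <set>
#include <string>
#include <vector>

// Collect every literal passed to I18N.get("...") in a chunk of source text.
// Simplification: escaped characters inside the literal are handled, but
// concatenated literals and strings built at run time are not.
std::set<std::string> extractStrings(const std::string& source) {
    static const std::regex call(
        R"rx(I18N\.get\(\s*"((?:[^"\\]|\\.)*)"\s*\))rx");
    std::set<std::string> found;
    for (std::sregex_iterator it(source.begin(), source.end(), call), end;
         it != end; ++it) {
        found.insert((*it)[1].str());
    }
    return found;
}

// Strings used in the code that have no entry in one locale's resources.
std::vector<std::string> missingStrings(
    const std::set<std::string>& used,
    const std::set<std::string>& translated) {
    std::vector<std::string> missing;
    std::set_difference(used.begin(), used.end(),
                        translated.begin(), translated.end(),
                        std::back_inserter(missing));
    return missing;
}
```

Running missingStrings once per locale gives the list that would go to that locale's translation team.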

We can have multiple dlls for each locale. The name of each dll is based on the locale name (language codes as in RFC 3066), written in the POSIX-style form
language[_territory][.codeset][@modifier]

We try to extract the locale from the machine and be as specific as possible when loading the I18N dll, but fall back to less specific locale variations if the more specific version is not present.

Example:

In the UK, if the locale was en_GB.UTF-8
(I use the term dll loosely, not in the specific Windows sense):

First look for the I18N.en_GB.UTF-8 dll.
If this dll does not exist, fall back to I18N.en_GB.
If this dll does not exist, fall back to I18N.en.
If this dll does not exist, fall back to I18N.default.

The only exception to this rule is: Simplified Chinese (zh_CN) where the fallback is US English (en_US). If the machine does not support simplified Chinese then it is unlikely to support full Chinese.
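
The fallback order can be sketched like this. The function name candidateNames is made up, and the sketch assumes the locale string is well formed. The Simplified Chinese exception is deliberately left out, because the answer does not say at which point in the chain en_US takes over.

```cpp
#include <string>
#include <vector>

// Build the list of dll names to try, most specific first, for a locale
// written as language[_territory][.codeset][@modifier].
std::vector<std::string> candidateNames(std::string locale) {
    std::vector<std::string> names;
    names.push_back("I18N." + locale);

    // Strip the optional parts from most to least specific:
    // @modifier first, then .codeset, then _territory.
    const char separators[] = {'@', '.', '_'};
    for (char separator : separators) {
        const std::string::size_type pos = locale.rfind(separator);
        if (pos != std::string::npos) {
            locale.erase(pos);
            names.push_back("I18N." + locale);
        }
    }

    // Last resort when nothing locale-specific is installed.
    names.push_back("I18N.default");
    return names;
}
```

The caller would then try to load each name in order and stop at the first one that exists.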

answered Sep 22 '22 21:09

Martin York

The simple way is to only use string IDs in your code - no literal strings. You can then produce different versions of the .rc file for each language and either create resource-only DLLs or simply different language builds.

There are a couple of shareware utilities to help with localising the .rc file; they handle resizing dialog elements for languages with longer words and warn about missing translations.

A more complicated problem is word order: if you have several values in a printf, they may have to appear in a different order to suit different languages' grammar. There are some extended printf classes on codeproject that let you specify things like printf("word %1s and %2s",var1,var2) so you can switch %1s and %2s if necessary.
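
To show why numbered placeholders solve the word-order problem, here is a small sketch. It is not one of the codeproject classes: formatPositional is a made-up function, it accepts only %1 to %9 (the plain %1 style from the question, not %1s), takes arguments that are already strings, and treats %% as a literal percent sign.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Replace %1..%9 with the matching argument, whatever order the
// placeholders appear in. Placeholders with no argument are left as-is.
std::string formatPositional(const std::string& pattern,
                             const std::vector<std::string>& args) {
    std::string out;
    for (std::string::size_type i = 0; i < pattern.size(); ++i) {
        const char c = pattern[i];
        if (c == '%' && i + 1 < pattern.size()) {
            const char next = pattern[i + 1];
            if (next == '%') {          // "%%" is a literal percent sign
                out += '%';
                ++i;
                continue;
            }
            if (next >= '1' && next <= '9') {
                const std::size_t index = static_cast<std::size_t>(next - '1');
                if (index < args.size()) {
                    out += args[index];
                    ++i;
                    continue;
                }
            }
        }
        out += c;
    }
    return out;
}
```

Because each argument is addressed by number, a translator can write "%2 pages, currently on %1" for a pattern that reads "Page %1 of %2" in English, and the calling code does not change.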

answered Sep 24 '22 21:09

Martin Beckett
