Internationalizing web apps always seems to be a chore. No matter how much you plan ahead for pluggable languages, there are always issues with encoding, funky phrasing that doesn't fit your templates, and other problems.
I think it would be useful to get the SO community's input for a set of things that programmers should look out for when deciding to internationalize their web apps.
What is localization in web application?
Website localization is the process of adapting an existing website to the language and culture of a target market, which involves much more than simply translating the text.
Internationalization is hard. Here are a few things I've learned from working with two websites that were in over 20 different languages:
- Use UTF-8 everywhere. No exceptions. HTML, server-side language (watch out for PHP especially), database, etc.
- No text in images unless you want a ton of work. Use CSS to put text over images if necessary.
- Separate configuration from localization. That way localizers can translate the text and you can deal with different configurations per locale (features, layout, etc). You don't want localizers to have the ability to mess with your app.
- Make sure your layouts can deal with text that is 2-3 times longer than English, and also with text that is about half as long (Japanese and Chinese are often shorter).
- Some languages need larger font sizes (Japanese, Chinese).
- Colors are locale-specific also. Red and green don't mean the same thing everywhere!
- Add a classname that is the locale name to the body tag of your documents. That way you can specify a specific locale's layout in your CSS file easily.
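- Watch out for variable substitution. Don't split your strings into fragments. Leave them whole, like "You have X new messages", and replace the X with the number at runtime. Word order differs between languages, so translators need to be able to move the placeholder.
- Different languages have different pluralization rules. Instead of just singular and plural, a language may need separate forms for ranges such as 0, 1, 2-4, and 5 and up. This is hard to deal with.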
- Context is difficult. Sometimes localizers need to know where/how a string is used to make sure it's translated correctly.
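To make the points about whole strings and pluralization concrete, here is a minimal Python sketch. The catalog layout, the `translate` helper, and the category names ("one", "few", "many", "other") are assumptions for illustration, not something from the sites mentioned above. It only handles non-negative integers, and the Polish rule is hand-written for this example; a real project would more likely rely on gettext or an ICU-based library than roll its own rules.

```python
# Hypothetical message catalog: every entry is a complete sentence with a
# named placeholder, so translators can move {count} wherever their grammar
# needs it. Keys and category names are illustrative assumptions.
CATALOG = {
    "en": {
        "new_messages": {
            "one": "You have {count} new message",
            "other": "You have {count} new messages",
        }
    },
    "pl": {
        "new_messages": {
            "one": "Masz {count} nową wiadomość",
            "few": "Masz {count} nowe wiadomości",
            "many": "Masz {count} nowych wiadomości",
        }
    },
}


def plural_category_en(n: int) -> str:
    """English: singular for exactly 1, plural for everything else."""
    return "one" if n == 1 else "other"


def plural_category_pl(n: int) -> str:
    """Polish (non-negative integers only): three forms instead of two."""
    if n == 1:
        return "one"
    # 2-4, 22-24, 32-34, ... but not 12-14
    if n % 10 in (2, 3, 4) and n % 100 not in (12, 13, 14):
        return "few"
    return "many"


# One plural rule per locale; adding a locale means adding a rule here
# and a set of whole-sentence templates in CATALOG.
PLURAL_RULES = {"en": plural_category_en, "pl": plural_category_pl}


def translate(locale: str, key: str, count: int) -> str:
    """Pick the plural form for this locale, then fill in the placeholder."""
    category = PLURAL_RULES[locale](count)
    template = CATALOG[locale][key][category]
    return template.format(count=count)


if __name__ == "__main__":
    for n in (1, 3, 5, 12, 22):
        print(translate("en", "new_messages", n), "|", translate("pl", "new_messages", n))
```

The application code never concatenates fragments like "You have " + n + " new messages"; it only asks for a key and a count. That keeps the plural logic per locale in one place and leaves the sentence structure entirely to the translators.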
Resources:
- http://interglacial.com/~sburke/tpj/as_html/tpj13.html
- http://www.ryandoherty.net/2008/05/26/quick-tips-for-localizing-web-apps/
- http://ed.agadak.net/2007/12/one-potato-two-potato-three-potato-four