
What are the best practices for multilanguage sites?

I want to make a multi-language site, such that all or almost all pages will be available in 2 or more translations. What are the best practices to follow?

For example, I consider these language selection mechanisms:

  1. Cookie-based selection of the preferred language.
  2. Based on Accept-Language header if the cookie is not set.
  3. Based on GeoIP otherwise (probably).

Is there anything else?

How should different translations be served?

  1. as LANG.example.com/page
  2. as example.com/LANG/page
  3. as example.com/page?hl=LANG
  4. ...
  5. any of the above with a redirect to example.com/page? (It seems to be discouraged)

How to ensure that all the translations are properly indexed?

  1. Are sitemaps listing all pages plus a correct Content-Language header enough?

What is the best way to let users know that other translations exist without distracting them?

  1. list available languages in the header/footer/sidebar (like Wikipedia)
  2. put “Choose a language” selector next to the content

What is the best policy to deal with missing/outdated translations?

  1. do not display missing pages at all or display a page in a different language?
  2. display old translation, old translation with a warning or a page in a different language?

What else should I take into account? What should I do, and what should I definitely not do?

asked Jan 28 '09 14:01 by sastanin


2 Answers

In addition to @Quassnoi's answers, ensure that you use standard RFC 4646 language identifiers (e.g. EN-US, DE-AT); you may already be aware of this. (RFC 4646 has since been superseded by RFC 5646.) The CLDR project is an excellent repository of internationalization data (the Supplemental Data is really useful).

If a translation of a specific page is not available, use a language fallback mechanism back to the neutral language; for example "DE-AT", "DE", "" (neutral, e.g. "EN").
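A minimal Python sketch of such a fallback chain, in case it helps. The function names and the choice of "en" as the neutral language are my own assumptions for illustration, not anything standard:

```python
def fallback_chain(tag, neutral="en"):
    """Return candidate tags from most to least specific, ending with the neutral language."""
    parts = tag.split("-")
    # "de-AT" -> ["de-AT", "de"]
    chain = ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]
    if neutral.lower() not in (c.lower() for c in chain):
        chain.append(neutral)
    return chain


def pick_translation(tag, available, neutral="en"):
    """Return the first available translation along the fallback chain, or None."""
    # Language tags are case-insensitive, so compare in lower case.
    available_lower = {a.lower(): a for a in available}
    for candidate in fallback_chain(tag, neutral):
        match = available_lower.get(candidate.lower())
        if match is not None:
            return match
    return None
```

For a page that exists only in "de" and "en", `pick_translation("de-AT", ["de", "en"])` would give "de", and a request for "fr-CA" would end up at the neutral "en".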

Most recent browsers and the underlying operating systems will correctly show all of the characters required for a locale selector list if the page is encoded correctly (I'd recommend all pages being UTF-8). Ensure that the locale list contains both the native and current-language names to allow both native and non-native speakers to view the specified translations, e.g. "Deutsch (German)" if the current locale is EN-*.
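As a rough sketch of how the selector labels could be built: the two small name tables below are hand-written placeholders (a real site would more likely take these names from CLDR data), and `selector_label` is just a name I made up:

```python
# Illustrative data only; in practice these names would come from CLDR.
NATIVE_NAMES = {"en": "English", "de": "Deutsch", "fr": "français"}
NAMES_IN = {
    "en": {"en": "English", "de": "German", "fr": "French"},
    "de": {"en": "Englisch", "de": "Deutsch", "fr": "Französisch"},
}


def selector_label(target, current):
    """Label for `target` in a selector shown to a user browsing in `current`."""
    native = NATIVE_NAMES[target]
    localized = NAMES_IN.get(current, {}).get(target)
    # Skip the parenthesised part when it is unknown or would repeat the native name.
    if localized is None or localized.casefold() == native.casefold():
        return native
    return f"{native} ({localized})"
```

With the current locale set to "en", the entry for German comes out as "Deutsch (German)", while the entry for English itself stays a plain "English".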

A lot of sites use a flag icon to show the current locale, but a flag identifies a location rather than a language, and some people may be offended if you show only a dominant flag (e.g. the US or UK flag for English).

It may be worthwhile to have a more visible (semi-graphical) locale selector on the home page if no locale cookie has been submitted, using a combination of GeoIP and Accept-Language to determine the default locale choice.
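Here is one way the default choice could be computed, as a sketch only. The order (cookie, then Accept-Language, then GeoIP) follows the question; the GeoIP result is assumed to be resolved elsewhere and passed in as a language tag, and all function and parameter names are hypothetical:

```python
def parse_accept_language(header):
    """Parse an Accept-Language header into tags ordered by descending q-value."""
    entries = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip()
        params = params.strip()
        quality = 1.0  # a missing q parameter means q=1
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if tag and tag != "*" and quality > 0:
            # Sort by quality first, then by original position in the header.
            entries.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(entries)]


def choose_locale(cookie_locale, accept_language, geoip_locale, supported, default="en"):
    """Pick a supported locale: cookie first, then Accept-Language, then GeoIP."""
    supported_lower = {s.lower(): s for s in supported}

    def best_supported(tag):
        # Try the full tag, then progressively shorter prefixes ("de-AT" -> "de").
        parts = tag.lower().split("-")
        for i in range(len(parts), 0, -1):
            hit = supported_lower.get("-".join(parts[:i]))
            if hit is not None:
                return hit
        return None

    candidates = []
    if cookie_locale:
        candidates.append(cookie_locale)
    candidates.extend(parse_accept_language(accept_language or ""))
    if geoip_locale:
        candidates.append(geoip_locale)
    for candidate in candidates:
        hit = best_supported(candidate)
        if hit is not None:
            return hit
    return default
```

The result is only a pre-selection for the selector; the user's explicit choice should then be written to the cookie so it wins on later visits.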

Semi-related: if your users are located in different time zones, include a time zone preference in their account profile so that time values can be displayed in their local time. And store all timestamps in UTC.
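A small sketch of the store-in-UTC, display-in-local-time idea using only Python's standard `datetime` module. The function names are mine, and `user_zone` stands for whatever `tzinfo` object you build from the profile preference:

```python
from datetime import datetime, timezone


def to_storage(dt):
    """Normalize an aware datetime to UTC before storing it."""
    if dt.tzinfo is None:
        # A naive datetime carries no zone, so converting it would be a guess.
        raise ValueError("naive datetime: attach a time zone before storing")
    return dt.astimezone(timezone.utc)


def for_display(stored_utc, user_zone):
    """Convert a stored UTC timestamp to the user's preferred zone for display."""
    return stored_utc.astimezone(user_zone)
```

If the profile stores an IANA zone name such as "Europe/Vienna", passing `zoneinfo.ZoneInfo(name)` (Python 3.9+) as `user_zone` should take care of daylight saving changes as well.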

answered Sep 28 '22 03:09 by devstuff


Decide early on whether you need to support languages that require double-byte characters (Chinese, Japanese, Korean, etc.). Unicode is the preferable choice. It can be tedious to change later, especially if you have a database that doesn't use Unicode.

answered Sep 28 '22 03:09 by Fredriku73