Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do you internationalise text in the database?

Having a fully internationalised application is a necessity if you want to sell worldwide. In Java we're using resource bundles and that solves things for static text codeside.

But what do you do about text that is stored in the database? Starting with static definitions, to user modifiable objects, ending with user entered data.

Assuming you have a database used by users with different Locales - how do you handle this? How far do you internationalise? Where do you draw the line? What workaround can keep users from receiving text in a language they don't understand?

like image 556
Stroboskop Avatar asked Nov 24 '08 13:11

Stroboskop


People also ask

What is Localization in SQL?

THERE ARE FOUR METHODS THAT ARE USED TO LOCALIZE DATABASES: Row localization – copies the original row for each language. Field localization – updates the values of the localized fields. Table localization – adds new language tables for each table. Database cloning – creates a copy of the database for each language.

What is the best database design for multi language data?

Single Translation Table Approach. This solution seems to be the cleanest one from the database structure perspective. You store all texts that need to be translated into a single translation table.

What is locale in internationalization?

Localization refers to the adaptation of a product, application or document content to meet the language, cultural and other requirements of a specific target market (a locale). Localization is sometimes written in English as l10n, where 10 is the number of letters in the English word between l and n.


2 Answers

Don't store system generated text in the database. Instead, store a code (like a message number) and then internationalize it at the GUI level. Make sure that the only text that comes directly out of the database is text that the user put in themselves. Make sure your database is set to accept unicode text.

like image 195
Paul Tomblin Avatar answered Sep 28 '22 16:09

Paul Tomblin


Firstly, be very aware of the limitations. For user-created content, you're looking at community translation (erratic), machine translation (unreliable) or paying human translators (expensive!) if you want to localize stuff that your users are entering into your application. You may want to ask your users to provide two version - one for your default culture (English?) and one for their localized culture, so you can provide a fall-back translation for other users?

Second, be prepared for some extremely lengthy database migrations... if you've got four columns of text in an Excel spreadsheet, suddenly you're dealing with inserting each value into your translation system, retrieving the localized ID, and then storing that in the table you're actually importing - and SELECT * will only give you phrase IDs, which you need to resolve back into strings by localizing them against your translation tables.

That said - you can localize lots of the lookup tables, drop-down lists, etc. that are driven by the database in a typical project. Other comments have already mentioned storing StringId values in the database that refer to external resource files or spreadsheets, but if you're interested in holding ALL your localized text in the database alongside the data itself, then you might find this approach useful.

We've used a table called Phrase, which contains the ID and default (English) content for every piece of text in your application.

Your other tables end up looking like this:

CREATE TABLE ProductType (
    Id int primary key,
    NamePhraseId int, -- link to the Phrase containing the name of this product type.
    DescriptionPhraseId int
)

Create a second table Culture, which contains the specific and neutral cultures you're supporting. For bonus points, implement this table as a self-referential tree (each Culture record contains a nullable ParentCultureCode reference), so you can fall-back from specific cultures ("fr-CA" for Canadian French) to neutral cultures ("fr" if no regional localization exists), to your invariant / default culture (normally 'en' because it's so widely spoken)

Your actual translations are in a LocalizedPhrase table, that looks like:

CREATE TABLE LocalizedPhrase (
  PhraseId int primary key,
  CultureCode varchar(8) primary key,
  Content nvarchar(255) -- the actual localized content
)

You can extend this model if you want to provide male/female-specific localizations:

CREATE TABLE GenderedLocalizedPhrase (
  PhraseId  int primary key,
  CultureCode varchar(8) primary key,
  GenderCode char(1) primary key, -- 'm', 'f' or '?' - links to Gender table
  Content nvarchar(255)
)

You will want to cache this entire table graph in memory and modify your query/join strategies accordingly - caching the localizations inside Phrase classes and overriding the ToString() method on the Phrase object to inspect the current thread culture is one approach. If you try and do this stuff inside your queries, you'll incur a substantial performance cost and every query will end up looking like this:

-- assume @MyCulture contains the culture code ('ca-FR') that we are looking for:
SELECT 
    Product.Id, 
    Product.Name, 

    COALESCE(ProductStatusLocalizedPhrase.Content, ProductStatusPhrase.Content) as ProductStatus, 
    COALESCE(ProductTypeLocalizedPhrase.Content, ProductTypePhrase.Content) as ProductType, 
  FROM Product

    INNER JOIN ProductStatus ON Product.StatusId = ProductStatus.Id
    INNER JOIN Phrase as ProductStatusPhrase ON ProductStatus.NamePhraseId = Phrase.Id
    LEFT JOIN LocalizedPhrase as ProductStatusLocalizedPhrase 
      ON ProductStatus.NamePhraseId = ProductStatusLocalizedPhrase.Id and CultureCode = @MyCulture

    INNER JOIN ProductType ON Product.TypeId = ProductType.Id
    INNER JOIN Phrase as ProductTypePhrase ON ProductType.NamePhraseId = Phrase.Id
    LEFT JOIN LocalizedPhrase as ProductTypeLocalizedPhrase 
      ON ProductType.NamePhraseId = ProductTypeLocalizedPhrase.Id and CultureCode = @MyCulture
like image 42
Dylan Beattie Avatar answered Sep 28 '22 16:09

Dylan Beattie