Having a fully internationalised application is a necessity if you want to sell worldwide. In Java we're using resource bundles and that solves things for static text codeside.
But what do you do about text that is stored in the database? Starting with static definitions, to user modifiable objects, ending with user entered data.
Assuming you have a database used by users with different Locales - how do you handle this? How far do you internationalise? Where do you draw the line? What workaround can keep users from receiving text in a language they don't understand?
THERE ARE FOUR METHODS THAT ARE USED TO LOCALIZE DATABASES: Row localization – copies the original row for each language. Field localization – updates the values of the localized fields. Table localization – adds new language tables for each table. Database cloning – creates a copy of the database for each language.
Single Translation Table Approach. This solution seems to be the cleanest one from the database structure perspective. You store all texts that need to be translated into a single translation table.
Localization refers to the adaptation of a product, application or document content to meet the language, cultural and other requirements of a specific target market (a locale). Localization is sometimes written in English as l10n, where 10 is the number of letters in the English word between l and n.
Don't store system generated text in the database. Instead, store a code (like a message number) and then internationalize it at the GUI level. Make sure that the only text that comes directly out of the database is text that the user put in themselves. Make sure your database is set to accept unicode text.
Firstly, be very aware of the limitations. For user-created content, you're looking at community translation (erratic), machine translation (unreliable) or paying human translators (expensive!) if you want to localize stuff that your users are entering into your application. You may want to ask your users to provide two version - one for your default culture (English?) and one for their localized culture, so you can provide a fall-back translation for other users?
Second, be prepared for some extremely lengthy database migrations... if you've got four columns of text in an Excel spreadsheet, suddenly you're dealing with inserting each value into your translation system, retrieving the localized ID, and then storing that in the table you're actually importing - and SELECT *
will only give you phrase IDs, which you need to resolve back into strings by localizing them against your translation tables.
That said - you can localize lots of the lookup tables, drop-down lists, etc. that are driven by the database in a typical project. Other comments have already mentioned storing StringId values in the database that refer to external resource files or spreadsheets, but if you're interested in holding ALL your localized text in the database alongside the data itself, then you might find this approach useful.
We've used a table called Phrase, which contains the ID and default (English) content for every piece of text in your application.
Your other tables end up looking like this:
CREATE TABLE ProductType (
Id int primary key,
NamePhraseId int, -- link to the Phrase containing the name of this product type.
DescriptionPhraseId int
)
Create a second table Culture, which contains the specific and neutral cultures you're supporting. For bonus points, implement this table as a self-referential tree (each Culture record contains a nullable ParentCultureCode reference), so you can fall-back from specific cultures ("fr-CA" for Canadian French) to neutral cultures ("fr" if no regional localization exists), to your invariant / default culture (normally 'en' because it's so widely spoken)
Your actual translations are in a LocalizedPhrase table, that looks like:
CREATE TABLE LocalizedPhrase (
PhraseId int primary key,
CultureCode varchar(8) primary key,
Content nvarchar(255) -- the actual localized content
)
You can extend this model if you want to provide male/female-specific localizations:
CREATE TABLE GenderedLocalizedPhrase (
PhraseId int primary key,
CultureCode varchar(8) primary key,
GenderCode char(1) primary key, -- 'm', 'f' or '?' - links to Gender table
Content nvarchar(255)
)
You will want to cache this entire table graph in memory and modify your query/join strategies accordingly - caching the localizations inside Phrase classes and overriding the ToString() method on the Phrase object to inspect the current thread culture is one approach. If you try and do this stuff inside your queries, you'll incur a substantial performance cost and every query will end up looking like this:
-- assume @MyCulture contains the culture code ('ca-FR') that we are looking for:
SELECT
Product.Id,
Product.Name,
COALESCE(ProductStatusLocalizedPhrase.Content, ProductStatusPhrase.Content) as ProductStatus,
COALESCE(ProductTypeLocalizedPhrase.Content, ProductTypePhrase.Content) as ProductType,
FROM Product
INNER JOIN ProductStatus ON Product.StatusId = ProductStatus.Id
INNER JOIN Phrase as ProductStatusPhrase ON ProductStatus.NamePhraseId = Phrase.Id
LEFT JOIN LocalizedPhrase as ProductStatusLocalizedPhrase
ON ProductStatus.NamePhraseId = ProductStatusLocalizedPhrase.Id and CultureCode = @MyCulture
INNER JOIN ProductType ON Product.TypeId = ProductType.Id
INNER JOIN Phrase as ProductTypePhrase ON ProductType.NamePhraseId = Phrase.Id
LEFT JOIN LocalizedPhrase as ProductTypeLocalizedPhrase
ON ProductType.NamePhraseId = ProductTypeLocalizedPhrase.Id and CultureCode = @MyCulture
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With