Database localization

I have a number of database tables that contain name and description columns which need to be localized. My initial attempt at designing a DB schema that would support this was something like:

product
-------
id
name
description


local_product
-------
id
product_id
local_name
local_description
locale_id


locale
------
id
locale

However, this solution requires a new local_ table for every table that contains name and description columns that require localization. In an attempt to avoid this overhead, I redesigned the schema so that only a single localization table is needed:

product
-------
id
localization_id


localization    
-------
id    
local_name
local_description
locale_id


locale
------
id
locale

Here's an example of the data which would be stored in this schema when there are 2 tables (product and country) requiring localization:

country

id,     localization_id
-----------------------
1,      5

product

id,     localization_id
-----------------------
1,      2

localization

id,     local_name,   local_description,     locale_id
------------------------------------------------------
2,      apple,        a delicious fruit,     2
2,      pomme,        un fruit délicieux,    3
2,      apfel,        ein köstliches Obst,   4
5,      ireland,      a small country,       2
5,      irlande,      un petit pays,         3

locale

id,     locale
--------------
2,      en
3,      fr
4,      de
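
For concreteness, the single-table layout above could be sketched as follows. This is only an illustration using Python's built-in sqlite3 module with an in-memory database; the column types, the constraints, and the `product_text` helper are assumptions, not part of the original schema. It loads a subset of the rows shown above.

```python
import sqlite3

# Illustrative sketch: in-memory SQLite, names taken from the tables above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE locale (
    id     INTEGER PRIMARY KEY,
    locale TEXT NOT NULL UNIQUE
);
CREATE TABLE localization (
    id                INTEGER NOT NULL,
    local_name        TEXT NOT NULL,
    local_description TEXT,
    locale_id         INTEGER NOT NULL REFERENCES locale(id),
    PRIMARY KEY (id, locale_id)
);
-- No FOREIGN KEY is declared on localization_id here:
-- localization.id on its own is not unique.
CREATE TABLE product (id INTEGER PRIMARY KEY, localization_id INTEGER NOT NULL);
CREATE TABLE country (id INTEGER PRIMARY KEY, localization_id INTEGER NOT NULL);
""")

conn.executemany(
    "INSERT INTO locale (id, locale) VALUES (?, ?)",
    [(2, "en"), (3, "fr"), (4, "de")],
)
conn.executemany(
    "INSERT INTO localization (id, local_name, local_description, locale_id) "
    "VALUES (?, ?, ?, ?)",
    [
        (2, "apple", "a delicious fruit", 2),
        (2, "pomme", "un fruit délicieux", 3),
        (2, "apfel", "ein köstliches Obst", 4),
        (5, "ireland", "a small country", 2),
    ],
)
conn.execute("INSERT INTO product (id, localization_id) VALUES (1, 2)")
conn.execute("INSERT INTO country (id, localization_id) VALUES (1, 5)")


def product_text(conn, product_id, locale):
    """Return (local_name, local_description) for a product, or None if missing."""
    return conn.execute(
        """
        SELECT l.local_name, l.local_description
        FROM product p
        JOIN localization l ON l.id = p.localization_id
        JOIN locale lc ON lc.id = l.locale_id
        WHERE p.id = ? AND lc.locale = ?
        """,
        (product_id, locale),
    ).fetchone()
```

Calling `product_text(conn, 1, "fr")` looks up the French row for product 1; a locale with no matching row returns `None`.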

Notice that the compound primary key of the localization table is (id, locale_id), but the foreign key in the product table only refers to the first element of this compound PK. This seems like 'a bad thing' from the POV of normalization.

Is there any way I can fix this problem, or alternatively, is there a completely different schema that supports localization without creating a separate table for each localizable table?

Update: A number of respondents have proposed a solution that requires creating a separate table for each localizable table. However, this is precisely what I'm trying to avoid. The schema I've proposed above almost solves the problem to my satisfaction, but I'm unhappy about the fact that the localization_id foreign keys only refer to part of the corresponding primary key in the localization table.

Thanks, Don

Dónal asked Aug 24 '09 14:08


2 Answers

I think it's fine. You're describing a one-to-many relationship between a product and its localization text.

I'm wondering whether you should also store the English text in the localization table instead of denormalizing it in your product table.

Beth answered Sep 25 '22 14:09


I like the idea, but would go a step in the other direction, and have a localization entry for every column that is translated:

country

id,     name_locale_id,  description_locale_id
----------------------------------------------
1,      5,               9

product

id,     name_locale_id,  description_locale_id
----------------------------------------------
1,      2,               8

localization

id,     locale_id,    value
-----------------------------------------
2,      2,            apple
2,      3,            pomme
2,      4,            apfel
5,      2,            ireland
5,      3,            irlande
8,      2,            a delicious fruit
8,      3,            un fruit délicieux
8,      4,            ein köstliches Obst
9,      2,            a small country
9,      3,            un petit pays

locale

id,     locale
--------------
2,      en
3,      fr
4,      de
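
As a rough sketch of how these tables might be queried (Python's built-in sqlite3 module, in-memory database), something like the following could work. The column types and the `product_text` helper are my assumptions for illustration; only the table and column names come from the layout above.

```python
import sqlite3

# Illustrative sketch: one localization row per translated column value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE locale (
    id     INTEGER PRIMARY KEY,
    locale TEXT NOT NULL UNIQUE
);
CREATE TABLE localization (
    id        INTEGER NOT NULL,
    locale_id INTEGER NOT NULL REFERENCES locale(id),
    value     TEXT NOT NULL,
    PRIMARY KEY (id, locale_id)
);
CREATE TABLE product (
    id                    INTEGER PRIMARY KEY,
    name_locale_id        INTEGER NOT NULL,
    description_locale_id INTEGER NOT NULL
);
""")

conn.executemany(
    "INSERT INTO locale (id, locale) VALUES (?, ?)",
    [(2, "en"), (3, "fr"), (4, "de")],
)
conn.executemany(
    "INSERT INTO localization (id, locale_id, value) VALUES (?, ?, ?)",
    [
        (2, 2, "apple"), (2, 3, "pomme"), (2, 4, "apfel"),
        (8, 2, "a delicious fruit"),
        (8, 3, "un fruit délicieux"),
        (8, 4, "ein köstliches Obst"),
    ],
)
conn.execute(
    "INSERT INTO product (id, name_locale_id, description_locale_id) VALUES (1, 2, 8)"
)


def product_text(conn, product_id, locale_code):
    """Return (name, description) for a product by joining localization twice."""
    return conn.execute(
        """
        SELECT n.value, d.value
        FROM product p
        JOIN locale lc ON lc.locale = :code
        JOIN localization n ON n.id = p.name_locale_id AND n.locale_id = lc.id
        JOIN localization d ON d.id = p.description_locale_id AND d.locale_id = lc.id
        WHERE p.id = :pid
        """,
        {"pid": product_id, "code": locale_code},
    ).fetchone()
```

Each translated column costs one extra join against localization, which is the trade-off for having a single table serve every localizable column.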

The PK of localization is (id, locale_id). It's no problem that id is also a FK reference in several other tables. You could add a surrogate PK if you want, so long as you still have a unique index on (id, locale_id).
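
If you want the database itself to enforce those references, note that some engines (SQLite and PostgreSQL, for example) require a foreign key to point at a primary key or unique constraint, which localization.id alone is not. One possible workaround, sketched below under my own assumptions, is a small parent table holding one row per translatable string. The `localization_key` table and the `insert_is_rejected` helper are hypothetical names that are not part of the schema above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript("""
CREATE TABLE locale (
    id     INTEGER PRIMARY KEY,
    locale TEXT NOT NULL UNIQUE
);
-- Hypothetical parent table: one row per translatable string.
CREATE TABLE localization_key (id INTEGER PRIMARY KEY);
CREATE TABLE localization (
    id        INTEGER NOT NULL REFERENCES localization_key(id),
    locale_id INTEGER NOT NULL REFERENCES locale(id),
    value     TEXT NOT NULL,
    PRIMARY KEY (id, locale_id)
);
CREATE TABLE product (
    id                    INTEGER PRIMARY KEY,
    name_locale_id        INTEGER NOT NULL REFERENCES localization_key(id),
    description_locale_id INTEGER NOT NULL REFERENCES localization_key(id)
);
""")

conn.executemany("INSERT INTO locale (id, locale) VALUES (?, ?)", [(2, "en"), (3, "fr")])
conn.executemany("INSERT INTO localization_key (id) VALUES (?)", [(2,), (8,)])
conn.executemany(
    "INSERT INTO localization (id, locale_id, value) VALUES (?, ?, ?)",
    [(2, 2, "apple"), (2, 3, "pomme"), (8, 2, "a delicious fruit")],
)
conn.execute(
    "INSERT INTO product (id, name_locale_id, description_locale_id) VALUES (1, 2, 8)"
)


def insert_is_rejected(conn, sql, params=()):
    """Return True if the statement violates a constraint, False if it succeeds."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


# A product pointing at a string id that does not exist should be refused.
dangling_product_rejected = insert_is_rejected(
    conn,
    "INSERT INTO product (id, name_locale_id, description_locale_id) VALUES (2, 99, 8)",
)
```

With this arrangement every foreign key targets a complete single-column primary key, while (id, locale_id) stays the primary key of localization.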

The nice thing about this is that a single localization table works for any table in your schema, regardless of which fields it has (you're not limited to localizing exactly a name and a description). The downside is a potential performance hit, since every translated column needs a lookup in the localization table. You could offset that by caching the whole table for a given locale_id: the cache is already specific to one language, so looking up an entry only requires its id.
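
A minimal sketch of that caching idea, assuming the localization table above and Python's built-in sqlite3 module. The `load_locale_cache` function is a hypothetical helper, and a real application would also need to decide when to refresh the cache.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE localization (
        id        INTEGER NOT NULL,
        locale_id INTEGER NOT NULL,
        value     TEXT NOT NULL,
        PRIMARY KEY (id, locale_id)
    )
    """
)
conn.executemany(
    "INSERT INTO localization (id, locale_id, value) VALUES (?, ?, ?)",
    [
        (2, 2, "apple"), (2, 3, "pomme"), (2, 4, "apfel"),
        (5, 2, "ireland"), (5, 3, "irlande"),
        (8, 2, "a delicious fruit"), (8, 3, "un fruit délicieux"),
    ],
)


def load_locale_cache(conn, locale_id):
    """Read every string for one locale into a dict keyed by localization id."""
    rows = conn.execute(
        "SELECT id, value FROM localization WHERE locale_id = ?", (locale_id,)
    )
    return {row_id: value for row_id, value in rows}


# One query per locale; later lookups are plain dict accesses by id.
french = load_locale_cache(conn, 3)
product_name = french.get(2)         # product.name_locale_id = 2
product_description = french.get(8)  # product.description_locale_id = 8
```

Using `dict.get` means a missing translation comes back as `None` rather than raising, which leaves room for whatever fallback you prefer.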

You could also consider keeping default name and description fields in the product table. They would be used when an entry is missing for the current language, or when the user didn't specify a language while entering the data. The same applies if you're porting an existing app, where you'd already have values there (without locale information).
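
One way that fallback might look, as a sketch only: LEFT JOIN the localization rows and let COALESCE pick the default column when no translation exists. The column types and the `product_text` helper are assumptions for illustration, again using Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE localization (
    id        INTEGER NOT NULL,
    locale_id INTEGER NOT NULL,
    value     TEXT NOT NULL,
    PRIMARY KEY (id, locale_id)
);
-- name and description hold the defaults used when a translation is missing.
CREATE TABLE product (
    id                    INTEGER PRIMARY KEY,
    name                  TEXT NOT NULL,
    description           TEXT NOT NULL,
    name_locale_id        INTEGER,
    description_locale_id INTEGER
);
""")
conn.executemany(
    "INSERT INTO localization (id, locale_id, value) VALUES (?, ?, ?)",
    [
        (2, 3, "pomme"),
        (8, 3, "un fruit délicieux"),
        (2, 4, "apfel"),  # German name only; no German description yet
    ],
)
conn.execute(
    "INSERT INTO product (id, name, description, name_locale_id, description_locale_id) "
    "VALUES (1, 'apple', 'a delicious fruit', 2, 8)"
)


def product_text(conn, product_id, locale_id):
    """Return (name, description), falling back to the product's default columns."""
    return conn.execute(
        """
        SELECT COALESCE(n.value, p.name), COALESCE(d.value, p.description)
        FROM product p
        LEFT JOIN localization n
               ON n.id = p.name_locale_id AND n.locale_id = ?
        LEFT JOIN localization d
               ON d.id = p.description_locale_id AND d.locale_id = ?
        WHERE p.id = ?
        """,
        (locale_id, locale_id, product_id),
    ).fetchone()
```

With this data, a German lookup gets the translated name but the default description, and an unknown locale falls back to the defaults for both columns.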

gregmac answered Sep 25 '22 14:09