
What's the best database structure to keep multilingual data? [duplicate]

People also ask

What are best practices for multi language database design?

For an application and its database to be truly multi-lingual, all texts should have a translation in each supported language – not just the text data in a particular table. This is achieved with a translation subschema where all data with textual content that can reach the user's eyes is stored.

What is multilingual database?

Building a database ready for internationalization means designing a database that can store multilingual data. In other words, the backend should be able to provide data in multiple languages. To do this, the backend should connect and retrieve this data from a multi-language database.

How does MySQL store multilingual data?

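MySQL can store text in any language when the column uses a Unicode character set. utf8mb4 covers all of Unicode, while the older 'utf8' alias is limited to three bytes per character and cannot store characters such as emoji. A collation such as 'utf8mb4_general_ci' or 'utf8mb4_unicode_ci' makes comparisons case insensitive.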

How does SQL Server store multilingual data?

The column you use to store multilingual data must be of a Unicode data type such as NVARCHAR or NCHAR (NTEXT also works but is deprecated). When multilingual data is inserted into such a column as a string literal, the literal should be prefixed with N, as in N'...'.


Similar to method 3:

[languages]
id (int PK)
code (varchar)

[products]
id (int PK)
neutral_fields (mixed)

[products_t]
id (int FK)
language (int FK)
translated_fields (mixed)
PRIMARY KEY: id,language

So for each table, make another table (in my case with a "_t" suffix) that holds the translated fields. When you SELECT * FROM products, simply add ... LEFT JOIN products_t ON products_t.id = products.id AND products_t.language = CURRENT_LANGUAGE.

Not that hard, and keeps you free from headaches.
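A minimal sketch of that layout, run against an in-memory SQLite database so it can be tried without a server. The `name` column, the sample rows and the `products_in` helper are assumptions made for illustration; the answer itself only says "translated_fields".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE languages (id INTEGER PRIMARY KEY, code TEXT NOT NULL UNIQUE);
CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL);
-- One row per (product, language); "name" stands in for the translated fields.
CREATE TABLE products_t (
    id INTEGER NOT NULL REFERENCES products(id),
    language INTEGER NOT NULL REFERENCES languages(id),
    name TEXT,
    PRIMARY KEY (id, language)
);
INSERT INTO languages VALUES (1, 'en'), (2, 'de');
INSERT INTO products VALUES (1, 9.99), (2, 4.50);
INSERT INTO products_t VALUES (1, 1, 'Chair'), (1, 2, 'Stuhl'), (2, 1, 'Lamp');
""")

def products_in(language_code):
    """Return (id, price, name) rows; name is None when no translation exists."""
    return conn.execute(
        """
        SELECT p.id, p.price, t.name
        FROM products AS p
        LEFT JOIN products_t AS t
               ON t.id = p.id
              AND t.language = (SELECT id FROM languages WHERE code = ?)
        ORDER BY p.id
        """,
        (language_code,),
    ).fetchall()

print(products_in("de"))
```

Because it is a LEFT JOIN, a product without a row for the requested language still shows up, with its translated columns set to NULL.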


Your third example is actually the way the problem is usually solved. Hard, but doable.

Remove the reference to product from the translation table and put a reference to translation where you need it (the other way around).

[ products ]
id (INT)
price (DECIMAL)
title_translation_id (INT, FK)

[ translation ]
id (INT, PK)
neutral_text (VARCHAR)
-- other properties that may be useful (date, creator etc.)

[ translation_text ]
translation_id (INT, FK)
language_id (INT, FK) 
text (VARCHAR)
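A rough sketch of how a lookup against this layout might go, again using in-memory SQLite. Falling back to neutral_text when a language has no row is my assumption about what that column is for; the numeric language ids and the `product_title` helper are made up for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE translation (id INTEGER PRIMARY KEY, neutral_text TEXT);
CREATE TABLE translation_text (
    translation_id INTEGER NOT NULL REFERENCES translation(id),
    language_id INTEGER NOT NULL,   -- hypothetical ids: 2 = German, 3 = French
    text TEXT,
    PRIMARY KEY (translation_id, language_id)
);
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    price REAL,
    title_translation_id INTEGER REFERENCES translation(id)
);
INSERT INTO translation VALUES (10, 'Supplier');
INSERT INTO translation_text VALUES (10, 2, 'Lieferant'), (10, 3, 'Fournisseur');
INSERT INTO products VALUES (1, 19.0, 10);
""")

def product_title(product_id, language_id):
    """Translated title, or neutral_text if that language has no row (assumed fallback)."""
    row = db.execute(
        """
        SELECT COALESCE(tt.text, tr.neutral_text)
        FROM products AS p
        JOIN translation AS tr ON tr.id = p.title_translation_id
        LEFT JOIN translation_text AS tt
               ON tt.translation_id = tr.id AND tt.language_id = ?
        WHERE p.id = ?
        """,
        (language_id, product_id),
    ).fetchone()
    return row[0] if row else None

print(product_title(1, 2))
```

Each translatable column on products costs one more pair of joins, which is the price of being able to reuse the same translation tables everywhere.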

As an alternative (not an especially good one), you can have a single field and keep all translations merged together in it (as XML, for example).

<translation>
  <en>Supplier</en>
  <de>Lieferant</de>
  <fr>Fournisseur</fr>
</translation>
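If you do go the single-field route, the application has to unpack the value itself. A small sketch with Python's standard xml.etree.ElementTree; the `pick_translation` helper and the English fallback are assumptions, not part of the answer.

```python
import xml.etree.ElementTree as ET

def pick_translation(xml_value, lang, fallback="en"):
    """Return the text for lang from the stored XML, else the fallback language, else None."""
    root = ET.fromstring(xml_value)
    for code in (lang, fallback):
        node = root.find(code)  # direct child element named after the language code
        if node is not None and node.text:
            return node.text
    return None

stored = (
    "<translation>"
    "<en>Supplier</en><de>Lieferant</de><fr>Fournisseur</fr>"
    "</translation>"
)

print(pick_translation(stored, "de"))
print(pick_translation(stored, "es"))
```

Searching or sorting by the translated text in SQL is awkward with this storage, which is one reason it is not an especially good option.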

To reduce the number of JOINs, you could keep the translated and non-translated fields in two separate tables:

[ products ]
id (INT)
price (DECIMAL)

[ products_i18n ]
id (INT)
name (VARCHAR)
description (VARCHAR)
lang_code (CHAR(5))
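With the language code stored directly in products_i18n, every translated field of a product comes back through a single join. A sketch on in-memory SQLite; the composite primary key, the sample rows and the `load_product` helper are assumptions for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL);
CREATE TABLE products_i18n (
    id INTEGER NOT NULL REFERENCES products(id),
    name TEXT,
    description TEXT,
    lang_code TEXT NOT NULL,
    PRIMARY KEY (id, lang_code)   -- assumed: one row per product and language
);
INSERT INTO products VALUES (1, 12.5);
INSERT INTO products_i18n VALUES
    (1, 'Kettle', 'Boils water', 'en_US'),
    (1, 'Wasserkocher', 'Kocht Wasser', 'de_DE');
""")

def load_product(product_id, lang_code):
    """Return (id, price, name, description), or None if that language is missing."""
    return con.execute(
        """
        SELECT p.id, p.price, i.name, i.description
        FROM products AS p
        JOIN products_i18n AS i ON i.id = p.id AND i.lang_code = ?
        WHERE p.id = ?
        """,
        (lang_code, product_id),
    ).fetchone()

print(load_product(1, "de_DE"))
```

The inner join used here drops the product entirely when the language is missing; switch to a LEFT JOIN if the untranslated row should still be returned.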

At my $DAYJOB we use gettext for I18N. I wrote a plugin for xgettext.pl that extracts all English text from the database tables and adds it to the master messages.pot.

It works very well - translators deal with only one file when doing translation - the po file. There's no fiddling with database entries when doing translations.
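The plugin itself is Perl and isn't shown here, so the following is only a guess at the general idea, written in Python: pull the distinct English strings out of a column and emit them as PO template entries. The table, the column, the `extract_pot_entries` helper and the `#:` reference format are all made up for the example.

```python
import sqlite3

def pot_escape(text):
    """Escape backslashes, double quotes and newlines for a PO string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

def extract_pot_entries(conn, table, column):
    """Build msgid/msgstr entries from the distinct non-NULL values of one column.

    table and column are assumed to be trusted identifiers, not user input.
    """
    rows = conn.execute(
        f'SELECT DISTINCT "{column}" FROM "{table}" '
        f'WHERE "{column}" IS NOT NULL ORDER BY 1'
    ).fetchall()
    entries = []
    for (text,) in rows:
        entries.append(
            f'#: {table}.{column}\nmsgid "{pot_escape(text)}"\nmsgstr ""\n'
        )
    return "\n".join(entries)

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO products VALUES (1, 'Supplier'), (2, 'Say "hi"');
""")

pot = extract_pot_entries(conn, "products", "name")
print(pot)
```

The output would then be appended to or merged with the existing messages.pot (msgcat or msgmerge from GNU gettext can do that), and at runtime the application looks up the English database value through gettext as usual.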