
Should I migrate a MySQL database with a latin1_swedish_ci collation to UTF-8 and, if so, how?

The MySQL database used by my Rails application currently has the default collation of latin1_swedish_ci. Since the default charset of Rails applications (including mine) is UTF-8, it seems sensible to me to use the utf8_general_ci collation in the database.

Is my thinking correct?

Assuming it is, what would be the best approach to migrate the collation and all the data in the database to the new encoding?

Olly asked Oct 13 '08 11:10



2 Answers

UTF-8, like any other Unicode encoding scheme, can store characters in any language, so it is an excellent choice of character set for your database.

The collation setting, on the other hand, is a completely separate issue from the encoding scheme. It involves sort orders, upper/lowercase conversions, string equality comparisons, and things like that which are language-specific. The collation setting should match the language that is used in the database.

The UTF-8 general collation (utf8_general_ci in MySQL) is meant for situations where the language is unknown and a simple default ordering is needed. It is a simplified, case-insensitive collation that is not tailored to any particular language, which is almost certainly not what you want if you're storing Swedish; MySQL provides utf8_swedish_ci for that case.
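To make the difference between encoding and collation concrete, here is a small Python sketch (not MySQL itself). The same UTF-8 strings are sorted twice: once by raw code point and once by a hand-written Swedish alphabet table. The table and the `swedish_key` helper are illustrative assumptions and much simpler than a real collation.

```python
# Swedish alphabet order: å, ä, ö come after z, in that order.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}

def swedish_key(word):
    # Map each letter to its position in the Swedish alphabet.
    # Letters outside the table sort last (a toy rule, not a real collation).
    return [RANK.get(ch, len(RANK)) for ch in word.lower()]

words = ["öl", "zebra", "ärta", "åska", "apa"]

# Default string sorting in Python compares Unicode code points.
by_code_point = sorted(words)

# Same strings, same encoding, different ordering rule.
by_swedish = sorted(words, key=swedish_key)

print(by_code_point)
print(by_swedish)
```

The stored bytes are identical in both cases; only the comparison rule changes. Code point order puts ä (U+00E4) before å (U+00E5), while the Swedish alphabet puts å first.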

Jeffrey L Whitledge answered Oct 16 '22 20:10



Convert to UTF-8 as the charset.

Collation settings only affect sorting and string comparison. Choose the collation that most of your users would expect.
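One way to do the conversion in MySQL is ALTER TABLE ... CONVERT TO CHARACTER SET, which re-encodes the existing column data. The Python sketch below only builds the statements so they can be reviewed before being run. The function name, database name, and table names are hypothetical. It also assumes the stored data really is latin1: if UTF-8 bytes were written into latin1 columns, a plain conversion would double-encode them. Take a backup first.

```python
def conversion_statements(database, tables,
                          charset="utf8", collation="utf8_general_ci"):
    """Build MySQL statements that switch a database and its tables
    to the given character set and collation.

    Identifiers are assumed to be trusted; nothing is escaped here.
    """
    # Change the default for tables created in the future.
    statements = [
        f"ALTER DATABASE `{database}` "
        f"CHARACTER SET {charset} COLLATE {collation};"
    ]
    # Convert each existing table, including its text column data.
    for table in tables:
        statements.append(
            f"ALTER TABLE `{table}` "
            f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )
    return statements

# Hypothetical names, for illustration only.
for stmt in conversion_statements("myapp_production", ["users", "posts"]):
    print(stmt)
```

Pass `collation="utf8_swedish_ci"` if Swedish ordering is what your users expect. On MySQL 5.5.3 and later, `utf8mb4` with a matching `utf8mb4_*` collation covers all of Unicode, whereas `utf8` is limited to three bytes per character.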

Christoph Schiessl answered Oct 16 '22 20:10
