
Why does MySQL use latin1_swedish_ci as the default?

Tags:

mysql

encoding

Does anyone know why latin1_swedish_ci is the default for MySQL? It seems to me that UTF-8 would be more compatible, right?

Defaults are usually chosen because they are the best universal choice, but in this case that does not seem to be what they did.

Metropolis asked Oct 14 '10 17:10



2 Answers

As far as I can see, latin1 was the default character set in pre-multibyte times, and it looks like that has been kept, probably for backward compatibility (e.g. for older CREATE statements that didn't specify a collation).

From here:

What 4.0 Did

MySQL 4.0 (and earlier versions) only supported what amounted to a combined notion of the character set and collation with single-byte character encodings, which was specified at the server level. The default was latin1, which corresponds to a character set of latin1 and collation of latin1_swedish_ci in MySQL 4.1.

As to why Swedish, I can only guess that it's because MySQL AB is/was Swedish. I can't see any other reason for choosing this collation. It comes with some specific sorting quirks (ÄÖÜ come after Z, I think), but they are nowhere near an international standard.
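To make the "sorting quirks" concrete, here is a minimal Python sketch of how a Swedish-style ordering differs from an accent-insensitive one. It is only an illustration: the alphabet table and the `swedish_key` and `base_letter_key` helpers are hypothetical, not MySQL's actual latin1_swedish_ci weight table, and the sketch covers only Å, Ä and Ö (the letters that close the Swedish alphabet). How the collation weights other accented letters such as Ü is not modeled here.

```python
import unicodedata

# Assumption: a simplified Swedish alphabet, with å, ä, ö placed after z.
# This is NOT MySQL's real weight table, just an illustration of the idea.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
WEIGHTS = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}


def swedish_key(word):
    """Case-insensitive key; characters outside the table sort last."""
    return [WEIGHTS.get(ch, len(SWEDISH_ALPHABET)) for ch in word.lower()]


def base_letter_key(word):
    """Case-insensitive key that strips accents, so 'Ä' compares like 'a'."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


words = ["Zebra", "Äpple", "apple", "Örn", "Ost"]

by_swedish = sorted(words, key=swedish_key)
by_base_letter = sorted(words, key=base_letter_key)

print(by_swedish)      # Ä and Ö land after Z
print(by_base_letter)  # Ä sorts with A, Ö sorts with O
```

With the Swedish-style key, "Äpple" and "Örn" end up after "Zebra"; with the accent-stripping key they sit next to "apple" and "Ost". That second behavior is what many non-Swedish users would expect by default.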

Pekka answered Sep 17 '22 19:09


latin1 is the default character set. MySQL's latin1 is the same as the Windows cp1252 character set. This means it is the same as the official ISO 8859-1 or IANA (Internet Assigned Numbers Authority) latin1, except that IANA latin1 treats the code points between 0x80 and 0x9f as “undefined,” whereas cp1252, and therefore MySQL's latin1, assign characters for those positions.

From http://dev.mysql.com/doc/refman/5.0/en/charset-we-sets.html

This might help you understand why.
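The difference in the 0x80 to 0x9f range can be seen with a short Python sketch. This uses Python's own `cp1252` and `latin-1` codecs as stand-ins, which is an assumption: they are not MySQL's implementation. In particular, Python's `latin-1` codec maps those bytes to the C1 control code points U+0080 to U+009F rather than treating them as undefined. The sample bytes are chosen to be ones that cp1252 does assign.

```python
# Bytes in the 0x80-0x9f range that cp1252 assigns to printable characters.
sample = bytes([0x80, 0x85, 0x93, 0x94])

as_cp1252 = sample.decode("cp1252")       # Windows-1252 interpretation
as_iso_8859_1 = sample.decode("latin-1")  # Python's ISO 8859-1 codec

# cp1252 yields the euro sign, an ellipsis and curly quotes.
print(as_cp1252)

# ISO 8859-1 yields non-printable C1 control code points instead.
print([f"U+{ord(c):04X}" for c in as_iso_8859_1])

# Outside 0x80-0x9f the two encodings agree, e.g. for 'A', 'ä', 'ö'.
shared = bytes([0x41, 0xE4, 0xF6])
same_outside_range = shared.decode("cp1252") == shared.decode("latin-1")
print(same_outside_range)
```

So text that contains only characters outside that range round-trips identically under either interpretation; the two only diverge for bytes such as the euro sign or "smart quotes".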

bear answered Sep 19 '22 19:09