utf-8 vs latin1

Tags: database, mysql

What are the advantages and disadvantages of using utf8 as a charset versus latin1?

If utf8 can support more characters and is used consistently, wouldn't it always be the better choice? Is there any reason to choose latin1?

qwertymk asked Sep 16 '12 18:09



1 Answer

UTF8 Advantages:

  1. Supports most languages, including RTL languages such as Hebrew.

  2. No translation needed when importing/exporting data to UTF8-aware components (JavaScript, Java, etc).
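To make the language-coverage point concrete, here is a minimal sketch in Python rather than MySQL. It assumes Python's `latin-1` codec (ISO-8859-1) as a stand-in for MySQL's latin1, which is actually closer to Windows-1252, so treat it as an approximation. The helper name `can_encode` is hypothetical.

```python
# Sketch: check which sample strings each encoding can represent.
# Assumption: Python's "latin-1" codec approximates MySQL's latin1 charset.

def can_encode(text: str, codec: str) -> bool:
    """Return True if every character in text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

samples = {
    "ascii": "hello",
    "western_european": "caf\u00e9",          # "cafe" with an acute accent
    "hebrew": "\u05e9\u05dc\u05d5\u05dd",     # Hebrew word "shalom" (RTL script)
}

for label, text in samples.items():
    print(label,
          "latin-1:", can_encode(text, "latin-1"),
          "utf-8:", can_encode(text, "utf-8"))
```

Under these assumptions, the Hebrew sample can only be stored in the UTF-8 column, while the ASCII and Western European samples fit in both.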

UTF8 Disadvantages:

  1. Non-ASCII characters will take more time to encode and decode, because UTF8 is a variable-length, multi-byte encoding.

  2. Non-ASCII characters (those outside the 128-character ASCII set) will take more space, as they may be stored using more than 1 byte. A CHAR(10) or VARCHAR(10) field may need up to 30 bytes to store some UTF8 characters, since MySQL's utf8 charset uses up to 3 bytes per character.

  3. Collations other than utf8_bin will be slower, as the sort order will not directly map to the character encoding order, and will require translation in some stored procedures (as variables default to utf8_general_ci collation).

  4. If you need to JOIN UTF8 and non-UTF8 fields, MySQL will impose a SEVERE performance hit. Queries that would otherwise run in under a second could take minutes if the joined fields use different character sets or collations.
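The storage figures in point 2 can be checked with a short sketch. It is written in Python rather than MySQL and only measures encoded byte lengths, so it ignores any per-row or per-column overhead the storage engine adds. The helper name `byte_len` is hypothetical.

```python
# Sketch: compare encoded sizes of 10-character strings.
# Assumption: Python's "latin-1" codec approximates MySQL's latin1 charset.

def byte_len(text: str, codec: str) -> int:
    """Return the number of bytes text occupies in the given codec."""
    return len(text.encode(codec))

ascii_text = "a" * 10          # plain ASCII
accented = "\u00e9" * 10       # e with acute accent, present in latin1
cjk = "\u4e2d" * 10            # a CJK character, absent from latin1

print(byte_len(ascii_text, "latin-1"), byte_len(ascii_text, "utf-8"))  # 10 10
print(byte_len(accented, "latin-1"), byte_len(accented, "utf-8"))      # 10 20
print(byte_len(cjk, "utf-8"))                                          # 30
```

ASCII text costs the same in both encodings. The overhead appears only for non-ASCII characters, and the 30-byte worst case corresponds to ten 3-byte characters.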

Bottom line:

If you don't need to support non-Latin1 languages, want to achieve maximum performance, or already have tables using latin1, choose latin1.

Otherwise, choose UTF8.

Ross Smith II answered Nov 15 '22 17:11
