I know that MySQL has a default of latin1 encoding, and apparently it takes 1 byte to store a character in latin1 and 3 bytes to store a character in utf-8. Is that correct?
I am working on a site that I hope will be used globally. Do I absolutely need to have utf-8? Or will I be able to get away with using latin1?
Also, I tried to change some tables from latin1 to utf8 but I got this error:
Specified key was too long; max key length is 1000 bytes
Does anyone know the solution to this? And should I really solve that, or will latin1 be enough?
Thanks, Alex
What is the difference between utf8 and latin1? They are different encodings that agree only on the ASCII range: ASCII characters are stored as the same single byte in both, while accented letters take one byte in Latin1 and two bytes in UTF-8. UTF-8 is one encoding of Unicode and covers all of its code points; Latin1 is a single-byte encoding, so it covers at most 256 characters.
The difference between MySQL's utf8 and utf8mb4 is that the former can only store characters that need at most 3 bytes, while the latter can also store characters that need 4 bytes. In Unicode terms, utf8 can only store characters in the Basic Multilingual Plane, while utf8mb4 can store any Unicode character.
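A quick way to see which side of that line a string falls on is to check its code points. The sketch below is only an illustration in Python, not anything MySQL provides; the function name `fits_in_mysql_utf8` is made up for this example, and it assumes "fits" simply means every character lies in the Basic Multilingual Plane (U+0000 to U+FFFF).

```python
def fits_in_mysql_utf8(text: str) -> bool:
    """Return True if every character is in the Basic Multilingual Plane.

    Characters up to U+FFFF encode to at most 3 bytes in UTF-8, which is
    what MySQL's 3-byte utf8 charset can hold. Anything above U+FFFF
    needs 4 bytes and therefore utf8mb4.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


if __name__ == "__main__":
    for sample in ["naïve café", "日本語", "party 🎉"]:
        verdict = "utf8 is enough" if fits_in_mysql_utf8(sample) else "needs utf8mb4"
        print(f"{sample!r}: {verdict}")
```

Emoji are the most common case that fails this check, which is why utf8mb4 is usually suggested for user-generated text.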
Why use UTF-8? An HTML page can only be in one encoding. You cannot encode different parts of a document in different encodings. A Unicode-based encoding such as UTF-8 can support many languages and can accommodate pages and forms in any mixture of those languages.
MySQL's utf8 can store only characters that encode to 1, 2, or 3 bytes, while utf8mb4 can store 4-byte characters as well. The characters supported by utf8 are a subset of those supported by utf8mb4.
it takes 1 byte to store a character in latin1 and 3 bytes to store a character in utf-8 - is that correct?
It takes 1 byte to store a latin1 character and 1 to 3 bytes to store a character in MySQL's utf8.
If you only use basic Latin characters and punctuation in your strings (0 to 127 in Unicode), both charsets will occupy the same length.
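The byte counts can be checked outside the database. Here is a minimal Python sketch; the helper name `utf8_byte_length` is just for this example, and Python's `latin-1` codec is ISO-8859-1, which is close to but not identical to MySQL's latin1.

```python
def utf8_byte_length(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


if __name__ == "__main__":
    # ASCII letter, accented letter, euro sign, emoji
    for ch in ["a", "é", "€", "😀"]:
        print(f"{ch!r}: {utf8_byte_length(ch)} byte(s) in UTF-8")

    # Characters that exist in Latin-1 always take exactly one byte there.
    print(len("é".encode("latin-1")))

    # Pure ASCII text is the same size in both encodings.
    text = "Hello, world!"
    print(len(text.encode("latin-1")) == utf8_byte_length(text))
```

So the storage cost of UTF-8 depends on the text: plain English costs nothing extra, most European accented letters cost one extra byte each, and only the 4-byte case needs utf8mb4.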
Also, I tried to change some tables from latin1 to utf8 but I got this error: "Specified key was too long; max key length is 1000 bytes" Does anyone know the solution to this? And should I really solve that or may latin1 be enough?
If you have a utf8 column of VARCHAR(334) or longer, MyISAM won't let you create an index on it, since there is a remote possibility of the column occupying more than 1000 bytes (334 characters × 3 bytes = 1002 bytes).
Note that keys of such length are rarely useful. You can create a prefix index (an index on only the first N characters of the column), which will be almost as selective for any real-world data.
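The arithmetic behind the limit is easy to sketch. The Python below is only an illustration: the 1000-byte figure comes from the error message in the question, the function name is made up, and it assumes the key consists of a single string column and ignores any per-column overhead.

```python
MAX_KEY_BYTES = 1000  # limit quoted in the MyISAM error message above


def max_indexable_chars(bytes_per_char: int, max_key_bytes: int = MAX_KEY_BYTES) -> int:
    """Longest character count that still fits in the key in the worst case,
    where every character takes the charset's maximum number of bytes."""
    return max_key_bytes // bytes_per_char


if __name__ == "__main__":
    for charset, width in [("latin1", 1), ("utf8", 3), ("utf8mb4", 4)]:
        print(f"{charset}: up to {max_indexable_chars(width)} characters")
```

With utf8 this gives 333 characters, which matches the VARCHAR(334) threshold mentioned above. A prefix index stays under the limit by indexing fewer characters, for example `CREATE INDEX idx_name ON some_table (some_column(100))`, where the index name, table name, column name, and the length 100 are all hypothetical.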
At a bare minimum I would suggest using UTF-8. Your data will be compatible with practically every other database and application out there nowadays, since the large majority of them use UTF-8.
If you go with LATIN1/ISO-8859-1 you risk the data not being stored properly, because it doesn't support most international characters, so text outside its range can end up garbled or replaced. (The original answer illustrated this with an image.)
If you go with UTF-8, you don't need to deal with these headaches.
Regarding your error, it sounds like you need to optimize your database. Consider this: http://bugs.mysql.com/bug.php?id=4541#c284415
It would help if you gave specifics on your table schema and column for that issue.