
Is it significantly better to use ISO-8859-1 rather than UTF-8 wherever possible?

For globalization of scripts, it is very common to use UTF-8 as the default charset, for example in HTML or as the default charset of MySQL. This is done even for Latin-script websites whose characters all fall within ISO-8859-1. Isn't it advantageous to use ISO-8859-1 when characters outside it are not needed? By advantageous, I mean critically beneficial.

My point is that in UTF-8 only code points 0-127 take 1 byte, while 128-255 take 2 bytes, whereas ISO-8859-1 uses 1 byte for every character. Doesn't that play a critical role in database storage?

Googlebot asked Dec 12 '22 07:12



2 Answers

If everything you need now and forever is ISO-8859-1, you'll save space by using it, though likely not much if most of the characters used have code points below 128. If you ever need to use anything outside of ISO-8859-1, you'll be in a world of hurt. From an overall perspective, the cost in storage for UTF-8 is way lower than the cost of implementing multiple encodings.
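To illustrate the "anything outside of ISO-8859-1" case, here is a minimal Python sketch. The helper name `fits_latin1` and the sample strings are made up for illustration; the sketch only relies on Python's built-in `iso-8859-1` codec, which raises `UnicodeEncodeError` for characters the encoding cannot represent.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character in text can be stored as ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        # At least one character has no ISO-8859-1 representation.
        return False


# Hypothetical sample data: Western European text fits, the euro sign
# (U+20AC) and Polish letters such as U+0141 do not.
samples = ["caf\u00e9", "price: 5 \u20ac", "\u0141\u00f3d\u017a"]
results = {s: fits_latin1(s) for s in samples}
```

Note that even the euro sign is outside ISO-8859-1, so a site that seems purely "Latin" can hit this limit sooner than expected.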

smparkes answered Dec 19 '22 09:12



Most of the 128 characters that UTF-8 encodes in 1 byte (0-127) are also the ones you use most when you work with ISO-8859-1. Let's have a look here. If you use UTF-8, you need 1 extra byte only when you use one of the 128-255 characters (not so common, I bet).
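The size difference is easy to measure. The following is a small Python sketch, not a benchmark; `byte_sizes` and the sample word are hypothetical names chosen for illustration, and the text is assumed to be representable in ISO-8859-1.

```python
from typing import Tuple


def byte_sizes(text: str) -> Tuple[int, int]:
    """Return (ISO-8859-1 size, UTF-8 size) in bytes for text."""
    return len(text.encode("iso-8859-1")), len(text.encode("utf-8"))


# ASCII-only text costs the same in both encodings.
ascii_sizes = byte_sizes("resume")                 # (6, 6)

# Each character in the 128-255 range costs one extra byte in UTF-8.
accented_sizes = byte_sizes("r\u00e9sum\u00e9")    # (6, 8)
```

So the overhead grows only with the number of accented characters, not with the total length of the text.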

My opinion? Use UTF-8 if you can and if you have no problem handling it. The time you save on the day you need some extra characters (or the day you have to translate your content) is really worth a few extra bytes here and there in the DB...

lorenzo-s answered Dec 19 '22 11:12
