For globalization of scripts, it is very common to use UTF-8 as the default charset, for example in HTML or as the default charset in MySQL. This is also the case for Latin-script websites whose characters all fall within ISO-8859-1. Isn't it advantageous to use ISO-8859-1 when characters outside it are not needed? By advantageous, I mean critically beneficial.
My point is that in UTF-8 only code points 0-127 take 1 byte and code points 128-255 take 2 bytes, whereas ISO-8859-1 is a 1-byte encoding throughout. Doesn't that play a critical role in database storage?
If everything you need now and forever is ISO-8859-1, you'll save space by using it, though likely not much if most of the characters used are < 128. If you ever need to use anything outside of ISO-8859-1, you'll be in a world of hurt. From an overall perspective, the cost in storage for UTF-8 is way lower than the cost of implementing multiple encodings.
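To make the "world of hurt" concrete, here is a minimal Python sketch that checks whether a string can be represented in ISO-8859-1 at all. The helper name `fits_latin1` and the sample strings are made up for illustration; only the standard `str.encode` call is used.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character in text exists in ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        # At least one character has no ISO-8859-1 representation.
        return False


# Hypothetical sample data: accented Latin letters fit, but the euro sign
# and Japanese text do not.
for sample in ["café", "naïve", "price: 5 €", "日本語"]:
    print(f"{sample!r}: fits ISO-8859-1 = {fits_latin1(sample)}")
```

Note that even the euro sign fails here, so a site that looks purely "Latin" can still hit the limit.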
Most of the 128 one-byte UTF-8 characters (the ASCII range, 0-127) are the ones used most often when you work with ISO-8859-1. If you use UTF-8, you will need 1 extra byte only when you use one of the characters in the 128-255 range (not so common, I bet).
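If you want to measure the overhead on your own data, a rough sketch in Python could look like this. The function name `encoded_sizes` and the sample strings are only illustrative; real text will give different ratios.

```python
def encoded_sizes(text: str) -> tuple[int, int]:
    """Return (ISO-8859-1 byte count, UTF-8 byte count) for text.

    Raises UnicodeEncodeError if text contains characters outside ISO-8859-1.
    """
    return len(text.encode("iso-8859-1")), len(text.encode("utf-8"))


# Hypothetical samples: pure ASCII, one accented letter, two non-ASCII letters.
for sample in ["hello world", "café", "Über die Straße"]:
    latin1, utf8 = encoded_sizes(sample)
    print(f"{sample!r}: ISO-8859-1 = {latin1} bytes, UTF-8 = {utf8} bytes")
```

Each character in the 128-255 range costs exactly one extra byte in UTF-8, and everything in the ASCII range costs the same in both encodings.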
My opinion? Use UTF-8 if you can and if you have no problem handling it. The time you save on the day you need some extra characters (or the day you have to translate your content) is really worth a few extra bytes here and there in the DB...