How widespread is the use of UTF-8 for non-English text, on the WWW or otherwise? I'm interested both in statistical data and in the situation in specific countries.
I know that ISO-8859-1 (or 15) is firmly entrenched in Germany - but what about languages where you have to use multibyte encodings anyway, like Japanese or Chinese? I know that a few years ago, Japan was still using the various JIS encodings almost exclusively.
Given these observations, would it even be true that UTF-8 is the most common multibyte encoding? Or would it be more correct to say that it is basically only used internally, in new applications that specifically target an international market and/or have to handle multi-language text? Is it acceptable nowadays to ship an app that uses ONLY UTF-8 for its output, or would each national market expect output files in its own legacy encoding so that other apps can read them?
Edit: I am NOT asking whether or why UTF-8 is useful or how it works. I know all that. I am asking whether it is actually being adopted widely and replacing older encodings.
We use UTF-8 almost exclusively in our service-oriented web-service world - even with "just" Western European languages, there are enough "quirks" in the various ISO-8859-X encodings to make our heads spin - UTF-8 really does solve all of that.
So I'd put in a BIG vote for using UTF-8 everywhere, all the time! :-) In a service-oriented world, and in .NET and Java environments, it's really not an issue or a potential problem anymore.
It just removes so many problems that you would otherwise have to deal with all the time.
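As a minimal sketch of what "use UTF-8 everywhere" means in practice in a Java environment: the ISO-8859-X quirks usually come from relying on the platform default charset, so the fix is to name the charset explicitly at every I/O boundary. The class name Utf8OutputDemo and the file name greeting.txt below are just illustrative.

import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8OutputDemo {
    public static void main(String[] args) throws IOException {
        String text = "Grüße aus München, ¥1000, 日本語";

        // Relying on the platform default charset is what causes the
        // ISO-8859-X "quirks": the same program writes different bytes
        // on different machines. Always name the charset explicitly.
        try (Writer out = Files.newBufferedWriter(Path.of("greeting.txt"),
                                                  StandardCharsets.UTF_8)) {
            out.write(text);
        }

        // Reading it back with the same explicit charset round-trips
        // every character, including the non-Latin-1 ones.
        String roundTripped = Files.readString(Path.of("greeting.txt"),
                                               StandardCharsets.UTF_8);
        System.out.println(roundTripped.equals(text)); // prints true
    }
}

Run it on a German Windows box and on a Linux server and you get byte-identical files, which is exactly the point.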
Marc
As of 11 April 2021, UTF-8 is used on 96.7% of websites.