
How prevalent is UTF-8 really?

How widespread is the use of UTF-8 for non-English text, on the WWW or otherwise? I'm interested both in statistical data and the situation in specific countries.

I know that ISO-8859-1 (or -15) is firmly entrenched in Germany, but what about countries whose languages require multibyte encodings anyway, like Japan or China? I know that a few years ago, Japan was still using the various JIS encodings almost exclusively.

Given these observations, would it even be true that UTF-8 is the most common multibyte encoding? Or would it be more correct to say that it's basically only used internally in new applications that specifically target an international market and/or have to work with multi-language texts? Is it acceptable nowadays to have an app that ONLY uses UTF-8 in its output, or would each national market expect output files to be in a different legacy encoding in order to be usable by other apps?

Edit: I am NOT asking whether or why UTF-8 is useful or how it works. I know all that. I am asking whether it is actually being adopted widely and replacing older encodings.

Michael Borgwardt asked Jun 26 '09 14:06



2 Answers

We use UTF-8 almost exclusively in our service-oriented web-service world. Even with "just" Western European languages, there are enough "quirks" in the various ISO-8859-X encodings to make our heads spin, and UTF-8 really just solves that.
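One well-known example of such a quirk is the euro sign: ISO-8859-1 has no euro sign at all, while ISO-8859-15 places it at byte 0xA4, which ISO-8859-1 reads as the generic currency sign. The sketch below illustrates this in Python; the sample string and the helper name `try_encode` are made up for illustration and are not part of the answer.

```python
# Illustrative sketch; SAMPLE and try_encode are hypothetical names.
SAMPLE = "Preis: 5 \u20ac"  # "Preis: 5 " followed by the euro sign (U+20AC)


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the encoding cannot represent the text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# ISO-8859-1 has no euro sign, so the text cannot be encoded at all.
latin1 = try_encode(SAMPLE, "iso-8859-1")

# ISO-8859-15 can encode it: the euro sign becomes the single byte 0xA4.
latin9 = try_encode(SAMPLE, "iso-8859-15")

# A reader that assumes ISO-8859-1 decodes without error but gets
# U+00A4 (currency sign) where the euro sign should be.
misread = latin9.decode("iso-8859-1")

# UTF-8 encodes the euro sign as three bytes and round-trips cleanly.
utf8 = try_encode(SAMPLE, "utf-8")

print(latin1)
print(latin9)
print(ascii(misread))
print(utf8.decode("utf-8") == SAMPLE)
```

The point of the sketch is that the misreading is silent: nothing fails, the character is simply wrong, which is the kind of problem that goes away when every component agrees on UTF-8.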

So I'd put in a BIG vote for using UTF-8 everywhere, all the time! :-) I guess in a service-oriented world and in .NET and Java environments, using it is really not an issue or a potential problem anymore.

It just solves so many problems that you really shouldn't have to deal with all the time...

Marc

marc_s answered Oct 14 '22 23:10



As of 11 April 2021, UTF-8 is used on 96.7% of websites.

dan04 answered Oct 15 '22 00:10
