
How to display all non-English characters correctly in a web site?

It's annoying to see even the most professional sites do it wrong. Posted text turns into something that's unreadable. I don't have much information about encodings. I just want to know about the problem that's making such a basic thing so hard.

  • Does HTTP encoding limit some characters?
  • Do users need to send info about the charset/encoding they are using?
  • Assuming everything arrives at the server intact, is the encoding used when saving that text causing the problem?
  • Is it something about browser implementations?
  • Do we need some JavaScript tricks to make it work?

Is there an absolute solution to this? It may have its limits but StackOverflow seems to make it work.

Ufuk Hacıoğulları asked Apr 19 '11 19:04



1 Answer

I suspect one needs to make sure that the whole stack handles the encoding with care:

  • Specify a web page font (CSS) that supports a wide range of international characters.
  • Specify the correct lang and charset attributes in the HTML and make sure that the browser is using the correct encoding.
  • Make sure the HTTP requests are sent with the appropriate charset specified in the headers.
  • Make sure the content of the HTTP requests is decoded properly in your web request handler.
  • Configure your database/datastore with an internationalization-friendly encoding/collation (such as UTF-8/UTF-16) and not one that just supports Latin characters (the default in some DBs).

The first few are normally handled by the browser and web framework of choice, but if you screw up the DB encoding or use a font with a limited character set, there will be no one to save you.
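To make the failure mode concrete, here is a minimal Python sketch of what happens when one layer of the stack disagrees about the encoding. It is only an illustration under assumptions: the sample string and all variable names are made up for the example, and a real site would do these steps inside its web framework and database driver rather than by hand.

```python
# Hypothetical sample text; any string with non-Latin-1 characters would do.
text = "Ufuk Hacıoğulları"

# Roughly what a browser sends when the page and form are UTF-8.
wire_bytes = text.encode("utf-8")

# Correct handling: decode with the same charset that was used to encode.
decoded_ok = wire_bytes.decode("utf-8")

# Mismatched handling: reading the same bytes as Latin-1 never raises an
# error, it just silently produces garbled text (mojibake).
decoded_bad = wire_bytes.decode("latin-1")

# A Latin-only store cannot represent some of these characters at all.
try:
    text.encode("latin-1")
    storable_as_latin1 = True
except UnicodeEncodeError:
    storable_as_latin1 = False

# Header value a server could send so the browser decodes the page as UTF-8.
content_type = "text/html; charset=utf-8"

print(decoded_ok == text)
print(decoded_bad == text)
print(storable_as_latin1)
```

The mismatched decode is the typical cause of the unreadable text described in the question: nothing fails loudly, so the damage is only noticed once the text is displayed. Using one encoding (commonly UTF-8) at every layer avoids it.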

Ivan Zlatev answered Nov 15 '22 10:11
