What should be used, and when? Is it always better to use UTF-8, or does ISO-8859-1 still matter in specific situations?
Is the character set related to geographic region?
Is there a benefit to putting this at the top of the CSS file?
@charset "utf-8";
Or to declaring it like this?
<link type="text/css; charset=utf-8" rel="stylesheet" href=".." />
I found this:
If Dreamweaver adds the tag when you add embedded style to the document, that is a bug in Dreamweaver. From the W3C FAQ:
"For style declarations embedded in a document, @charset rules are not needed and must not be used."
The charset specification has been part of CSS since version 2.0 (May 1998), so if you have a charset specification in a CSS file and Safari can't handle it, that's a bug in Safari.
And add accept-charset in the form:
<form action="/action" method="post" accept-charset="utf-8">
And what should be used if I use the XHTML doctype?
<?xml version="1.0" encoding="UTF-8"?>
or
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
Most libraries that don't hold a lot of foreign-language materials will be perfectly fine with the ISO-8859-1 encoding (also called Latin-1 or extended ASCII). If you do have a lot of foreign-language materials, choose UTF-8, since it gives you access to far more characters.
As of August 2022, 1.3% of all websites (but only 8 of the top 1000) use ISO/IEC 8859-1. It is the most commonly declared single-byte character encoding on the web, but because web browsers interpret it as its superset Windows-1252, such documents may also include characters from that set.
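To make the difference between the two encodings concrete, here is a minimal Java sketch. The class name Latin1VersusCp1252 and its decode helper are made up for illustration, and it assumes the JVM ships the windows-1252 charset (common, but not guaranteed by the Java specification). It decodes the single byte 0x80 both ways: ISO-8859-1 maps it to a C1 control character, while Windows-1252 maps it to the euro sign.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class Latin1VersusCp1252 {
    // Decode a single byte using the given charset.
    static String decode(byte b, Charset charset) {
        return new String(new byte[] { b }, charset);
    }

    public static void main(String[] args) {
        // Assumption: this JVM provides windows-1252; forName throws otherwise.
        Charset cp1252 = Charset.forName("windows-1252");
        byte b = (byte) 0x80;

        // ISO-8859-1 maps every byte to the code point with the same value.
        System.out.printf("ISO-8859-1:   U+%04X%n",
                (int) decode(b, StandardCharsets.ISO_8859_1).charAt(0)); // U+0080

        // Windows-1252 reuses 0x80-0x9F for printable characters such as the euro sign.
        System.out.printf("windows-1252: U+%04X%n",
                (int) decode(b, cp1252).charAt(0)); // U+20AC
    }
}
```

Bytes outside the 0x80 to 0x9F range decode identically in both charsets, which is why the substitution usually goes unnoticed.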
Going backwards from UTF-8 to ISO-8859-1 will cause replacement characters (shown as "?" or "�", depending on the tool) to appear in your text wherever a character has no ISO-8859-1 equivalent.
byte[] utf8 = ...
byte[] latin1 = new String(utf8, "UTF-8").getBytes("ISO-8859-1");
You can exercise more control by using the lower-level Charset APIs.
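As a rough sketch of what "more control" could look like with the java.nio.charset APIs: the class Latin1Transcoder and its method names below are hypothetical, not something from the answer above. Instead of silently substituting a character, the encoder is told to report unmappable characters, so the conversion fails loudly.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class Latin1Transcoder {
    // Convert UTF-8 bytes to ISO-8859-1 bytes, throwing instead of
    // substituting when the input is malformed or a character does not fit.
    static byte[] strict(byte[] utf8) throws CharacterCodingException {
        CharBuffer chars = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(utf8));

        ByteBuffer encoded = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(chars);

        // Copy out exactly the bytes that were written.
        byte[] latin1 = new byte[encoded.remaining()];
        encoded.get(latin1);
        return latin1;
    }

    // Check up front whether every character of the text exists in ISO-8859-1.
    static boolean fitsLatin1(String text) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(text);
    }
}
```

Swapping CodingErrorAction.REPORT for CodingErrorAction.REPLACE or CodingErrorAction.IGNORE would substitute or drop the offending characters instead. Which one is appropriate depends on whether losing data silently is acceptable in your application.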
UTF-8 is the dominant encoding for the World Wide Web (and internet technologies), accounting for 98% of all web pages, and up to 100.0% for some languages, as of 2022.
Unicode is taking over and has already surpassed all others. I suggest you hop on the train right now.
Note that there are several flavors of Unicode. Joel Spolsky gives an overview.
(Graph current as of Feb. 2012, see comment below for more exact values.)
UTF-8 is supported everywhere on the web. Only in specific applications is it not. You should always use UTF-8 if you can.
The downside is that for languages such as Chinese, UTF-8 takes more space than, say, UTF-16. But whether or not you plan to handle Chinese text, UTF-8 is fine.
The only con of using UTF-8 is that it can take more space than some other encodings. For Western languages, though, it takes almost no extra space at all, except for very special characters, and you can live with those extra bytes. We are in 2009 after all. ;)
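If you want to check the size claims yourself, a small Java sketch along these lines would do. The class EncodedSize and its method names are invented for this example. UTF-16LE is used so that no byte-order mark is counted.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class EncodedSize {
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // UTF_16LE writes no byte-order mark, so only the text itself is counted.
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16LE).length;
    }

    static int latin1Bytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        String ascii = "hello";
        String french = "caf\u00e9";        // "café"
        String chinese = "\u4f60\u597d";    // two CJK characters

        System.out.println(utf8Bytes(ascii) + " vs " + utf16Bytes(ascii));     // 5 vs 10
        System.out.println(utf8Bytes(french) + " vs " + latin1Bytes(french));  // 5 vs 4
        System.out.println(utf8Bytes(chinese) + " vs " + utf16Bytes(chinese)); // 6 vs 4
    }
}
```

Plain ASCII costs the same in UTF-8 as in ISO-8859-1, an accented letter costs one extra byte, and each of these CJK characters takes three bytes in UTF-8 against two in UTF-16.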