What happens when we don't specify <meta charset="utf-8"> in the HEAD of the HTML document?
It doesn't matter whether you use the short <meta charset="utf-8"> form or the longer <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> form, but the short one is easier to type. It also doesn't matter whether you type UTF-8 or utf-8. You should always use the UTF-8 character encoding. (Remember that this means you also need to save your content as UTF-8.)
Unfortunately, ASCII only encodes unaccented English letters, digits, and basic punctuation, so text in languages whose alphabets use other characters would not be displayed properly on your screen. UTF-8 was created to address this shortcoming: it can encode every Unicode character, which covers almost every written language in the world.
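To make the ASCII limitation concrete, here is a minimal Python sketch. The sample string and variable names are made up for illustration; the point is only that ASCII rejects non-English characters while UTF-8 round-trips them.

```python
# Sample text mixing accented Latin letters and Japanese (illustrative only).
text = "naïve café, 日本語"

# ASCII has no code points for ï, é, or the Japanese characters.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8 can encode the whole string and decode it back unchanged.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

print(ascii_ok)            # False
print(round_trip == text)  # True
```

Pure ASCII text is encoded to the same bytes in UTF-8, which is why English-only pages often look fine even when the declared encoding is wrong.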
It is not necessary to include <meta charset="blah">. As the specification says, the character set may also be specified by the server using the HTTP Content-Type header or by including a Unicode BOM at the beginning of the downloaded file.
Whether or not such a meta tag is present, browsers and other user agents first look at the HTTP headers for encoding information. Actually, even before that they honor user settings and do BOM sniffing, as described in section 8.2.2.1, Determining the character encoding, in HTML5 CR, which on this issue describes reality rather than just a proposed norm.
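The lookup order described here (BOM first, then the HTTP header, then the meta tag) can be sketched roughly in Python. This is a heavily simplified illustration, not the actual algorithm from the specification: the function names `charset_from_header` and `guess_encoding` are hypothetical, user overrides and byte heuristics are left out, and the windows-1252 fallback is an assumption that in real browsers varies with locale.

```python
import codecs
import re
from email.message import Message
from typing import Optional

# Byte order marks checked first (UTF-32 is ignored in this sketch).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16le"),
    (codecs.BOM_UTF16_BE, "utf-16be"),
]

# Crude pattern for both <meta charset=...> and the http-equiv form.
_META_RE = re.compile(
    rb"""<meta[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)


def charset_from_header(content_type: Optional[str]) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, if any."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def guess_encoding(body: bytes,
                   content_type: Optional[str] = None,
                   fallback: str = "windows-1252") -> str:
    # 1. A BOM at the start of the file wins.
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name
    # 2. Otherwise use the charset from the HTTP Content-Type header.
    header_charset = charset_from_header(content_type)
    if header_charset:
        return header_charset
    # 3. Otherwise look for a meta declaration near the start of the document.
    match = _META_RE.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # 4. Nothing found: fall back to a default (assumed here).
    return fallback


print(guess_encoding(b'<meta charset="utf-8">', "text/html; charset=ISO-8859-1"))
print(guess_encoding(b'<meta charset="utf-8">', "text/html"))
print(guess_encoding(b"<p>no hints</p>"))
```

In this sketch the meta tag only matters in the second call, where the header carries no charset and there is no BOM.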
So the answer is really “it depends”. In many cases, the meta tag is ignored, so omitting it has no effect, except perhaps in situations where the HTML document is saved locally (so that HTTP headers are lost). In many other cases, it is not ignored, but if it is omitted, browsers will infer the correct encoding anyway. And in some cases, where the tag happens to be the only thing that makes the browser use the right encoding, omitting it will cause the data to be misinterpreted, typically so that the bytes are read as windows-1252. How much this matters depends on the actual content.
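The windows-1252 misinterpretation is easy to reproduce. The following Python sketch uses a made-up sample string and simply decodes UTF-8 bytes with the wrong codec (`cp1252` is Python's name for windows-1252):

```python
# Text saved as UTF-8 (sample string is illustrative).
original = "Zoë ordered crème brûlée"
raw = original.encode("utf-8")

# What a browser shows if it wrongly assumes windows-1252:
# each two-byte UTF-8 sequence turns into two separate characters.
misread = raw.decode("cp1252")

print(misread)              # ZoÃ« ordered crÃ¨me brÃ»lÃ©e
print(raw.decode("utf-8"))  # Zoë ordered crème brûlée
```

The ASCII letters survive unchanged, so a page with only occasional non-ASCII characters is still readable, just with stray "Ã" sequences.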
What happens when we don't specify <meta charset="utf-8"> in the HEAD of the HTML document?
The user agent looks for the Content-Type response HTTP header sent from the server:
Content-Type: text/html; charset=utf-8
If the Content-Type header doesn't specify a charset, then different things might happen depending on the user agent. Some user agents try to guess the correct charset heuristically by analyzing some of the bytes from the response stream and looking for patterns of known encodings. If this fails, you might end up with question marks or weird symbols in your web page wherever you used characters outside the ASCII range.
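As a rough illustration of where the "question marks" come from, this Python sketch decodes bytes saved in a legacy encoding as if they were UTF-8. The sample word is made up, and `errors="replace"` only approximates what a browser does with undecodable bytes:

```python
# "café" saved in Latin-1: the é is the single byte 0xE9.
legacy_bytes = "café".encode("latin-1")

# Read as UTF-8, 0xE9 is an incomplete sequence, so it is replaced
# with U+FFFD, the replacement character usually drawn as a question mark in a diamond.
shown = legacy_bytes.decode("utf-8", errors="replace")

print(shown == "caf\ufffd")  # True
```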