
What happens when we don't specify <meta charset="utf-8">?

Tags:

html

What happens when we don't specify <meta charset="utf-8"> in the HEAD of the HTML document?

Pankaj Parashar asked May 12 '13 08:05



2 Answers

Whether or not such a meta tag is present, browsers and other user agents first look at the HTTP headers for encoding information. Actually, even before that they honor user settings and do BOM sniffing, as described in section 8.2.2.1, "Determining the character encoding", of the HTML5 CR, which on this issue describes reality rather than just a proposed norm.
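
To make that order of precedence concrete, here is a rough Python sketch: BOM first, then the charset from the HTTP header, then the meta tag, then a fallback. It is a simplification for illustration, not the spec algorithm. The function name guess_encoding, the 1024-byte scan window, the regex, and the windows-1252 fallback are all assumptions, and user overrides and heuristic sniffing are left out.

```python
import codecs
import re
from typing import Optional, Tuple

# BOM signatures checked before anything else (only three encodings, for brevity).
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Deliberately loose pattern: matches <meta charset="..."> as well as the older
# <meta http-equiv="Content-Type" content="text/html; charset=..."> form.
META_RE = re.compile(
    rb"<meta[^>]+charset\s*=\s*[\"']?([A-Za-z0-9_\-]+)", re.IGNORECASE
)


def guess_encoding(
    body: bytes,
    http_charset: Optional[str] = None,
    fallback: str = "windows-1252",
) -> Tuple[str, str]:
    """Return (encoding, source) using a simplified precedence order."""
    # 1. A BOM at the start of the file wins.
    for bom, name in BOMS:
        if body.startswith(bom):
            return name, "bom"
    # 2. Otherwise, a charset from the HTTP Content-Type header.
    if http_charset:
        return http_charset.lower(), "http-header"
    # 3. Otherwise, a meta tag near the top of the document
    #    (the 1024-byte window is an assumption of this sketch).
    match = META_RE.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower(), "meta"
    # 4. Nothing declared: fall back to a default.
    return fallback, "fallback"
```

With this sketch, the meta tag only matters in step 3, that is, when there is no BOM and the server sent no charset.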

So the answer is really “it depends”. In many cases, the meta tag is ignored, so omitting it has no effect, except perhaps in situations where the HTML document is saved locally (so that HTTP headers are lost). In many other cases, it is not ignored, but if it is omitted, browsers will infer the correct encoding anyway. And in some cases, where the tag happens to be the only thing that makes the browser use the right encoding, omitting it will cause the data to be misinterpreted, typically with the bytes being read as windows-1252. How much this matters depends on the actual content.
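
As a small illustration of that last case, this Python snippet decodes UTF-8 bytes as windows-1252 (which Python calls cp1252). The helper name misread_as_windows_1252 is made up for this example. errors="replace" is used because Python's cp1252 codec leaves a few byte values undefined.

```python
def misread_as_windows_1252(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as if they were windows-1252."""
    utf8_bytes = text.encode("utf-8")
    # "cp1252" is Python's codec name for windows-1252.
    return utf8_bytes.decode("cp1252", errors="replace")


print(misread_as_windows_1252("café"))         # non-ASCII character gets garbled
print(misread_as_windows_1252("plain ASCII"))  # ASCII-only text is unaffected
```

ASCII-only content comes out the same either way, which is why the damage depends on what the page actually contains.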

Jukka K. Korpela answered Oct 02 '22 00:10


What happens when we don't specify <meta charset="utf-8"> in the HEAD of the HTML document?

The user agent looks for the Content-Type response HTTP header sent from the server:

Content-Type: text/html; charset=utf-8 

If the Content-Type header doesn't specify a charset, then different things might happen depending on the user agent. Some user agents might try to guess the correct charset heuristically, by analyzing some of the bytes from the response stream and looking for patterns typical of known encodings. If this fails, you might end up with question marks or weird symbols in your web page wherever you used characters outside the ASCII range.
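
Here is a rough Python sketch of that flow, assuming a client that reads the charset from the Content-Type header and otherwise falls back to a guess. The helper names and the default guess of utf-8 are assumptions for illustration. Real browsers use more elaborate heuristics.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(value: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value, if any."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()  # None when no charset parameter is present


def decode_body(body: bytes, content_type: str, guess: str = "utf-8") -> str:
    """Decode with the declared charset, or with a guess if none was declared."""
    charset = charset_from_content_type(content_type) or guess
    # errors="replace" turns undecodable bytes into U+FFFD, the replacement
    # character that browsers often render as a question mark in a diamond.
    return body.decode(charset, errors="replace")


body = "café".encode("cp1252")  # page actually saved as windows-1252
print(decode_body(body, "text/html; charset=windows-1252"))  # declared charset is used
print(decode_body(body, "text/html"))  # no charset: the wrong guess garbles the "é"
```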

Darin Dimitrov answered Oct 02 '22 00:10