
Is the "charset" meta tag required with HTML5?

The W3C "HTML5 differences from HTML4" working draft states:

For the HTML syntax, authors are required to declare the character encoding.

What does "required" mean?

Obviously, a browser will still render HTML5 without the charset meta tag. If no encoding is specified, which encoding will a browser use?

Basically, I want to know if it is actually necessary to include <meta charset="">, or if 99% of the time browsers will use the correct encoding anyway.

twiz asked Feb 03 '13 03:02



2 Answers

It is not necessary to include <meta charset="blah">. As the specification says, the character set may also be specified by the server using the HTTP Content-Type header or by including a Unicode BOM at the beginning of the downloaded file.

Most web servers today send a default character set in the Content-Type header for HTML text data if none is specified. If the web server doesn't send a character set in the Content-Type header, the file does not include a BOM, and the page does not include a <meta charset="blah"> declaration, the browser falls back to a default encoding that is usually based on the language settings of the host computer. If this does not match the actual character encoding of the file, some characters will be displayed improperly.
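To make the "displayed improperly" case concrete, here is a minimal Python sketch of what happens when bytes saved as UTF-8 are decoded with a different fallback. The helper name `decode_as` and the choice of windows-1252 as the fallback are assumptions for illustration; the actual fallback depends on the browser and its locale.

```python
# The bytes a file saved as UTF-8 contains for the text "café".
raw = "café".encode("utf-8")


def decode_as(data: bytes, encoding: str) -> str:
    """Decode bytes the way a client would after settling on `encoding`.

    Hypothetical helper for illustration; not a browser API.
    """
    return data.decode(encoding, errors="replace")


right = decode_as(raw, "utf-8")         # encoding matches the file
wrong = decode_as(raw, "windows-1252")  # assumed locale fallback, does not match

print(right)
print(wrong)
```

The two bytes that UTF-8 uses for "é" are read as two separate windows-1252 characters, so the second line comes out as "cafÃ©". Plain ASCII text is unaffected, which is why the problem often goes unnoticed on English-only pages.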

Will browsers use the proper encoding 99% of the time? If your page is UTF-8, probably. If not, probably not.

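The W3C provides a document outlining the precedence rules for the three methods, which gives the order as HTTP header, BOM, then in-document specification (meta tag). Note that the encoding sniffing algorithm in the current HTML standard checks for a BOM first, so in modern browsers a BOM takes precedence over the HTTP header.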
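As a rough illustration of how the three declarations interact, the Python sketch below collects whatever each source declares and then picks a winner according to a precedence order. The function names `declared_encodings` and `resolve` are hypothetical, and the meta pattern is a simplification of what browsers really do. The default order follows the one given in this answer; the current HTML standard checks the BOM first, which can be modelled by passing a different `order`.

```python
import codecs
import re
from email.message import Message

# Byte order marks and the encodings they signal.
BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)

# Simplified pattern; real browsers run a more involved prescan.
META_RE = re.compile(
    rb'<meta[^>]*?charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def declared_encodings(content_type, body):
    """Collect every encoding declaration found for one response."""
    found = {}

    # 1. charset parameter of the HTTP Content-Type header, if any.
    if content_type:
        msg = Message()
        msg["Content-Type"] = content_type
        charset = msg.get_content_charset()
        if charset:
            found["http"] = charset

    # 2. Byte order mark at the very start of the body.
    for bom, name in BOMS:
        if body.startswith(bom):
            found["bom"] = name
            break

    # 3. <meta charset> (or the http-equiv form) near the top of the document.
    match = META_RE.search(body[:1024])
    if match:
        found["meta"] = match.group(1).decode("ascii").lower()

    return found


def resolve(found, order=("http", "bom", "meta")):
    """Return (source, encoding) for the first source in `order` that declared one."""
    for source in order:
        if source in found:
            return source, found[source]
    return None, None


# A deliberately contradictory response: all three sources disagree.
page = codecs.BOM_UTF8 + b'<!DOCTYPE html><meta charset="iso-8859-1"><title>t</title>'
found = declared_encodings("text/html; charset=windows-1252", page)

print(resolve(found))                                 # order as given in this answer
print(resolve(found, order=("bom", "http", "meta")))  # BOM-first order
```

When nothing is declared, `resolve` returns `(None, None)`, which corresponds to the case where the browser has to fall back to its locale-based default.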

hrunting answered Sep 22 '22 19:09


According to the Google PageSpeed browser extension, declaring a charset in a meta element "disables IE8's lookahead feature" which apparently forces it to download everything in serial.

My understanding was that <meta charset="utf-8"> was required for valid HTML5, which is why I started browsing here.

That draft of the spec seems pretty clear to me, and since I add the HTTP header via .htaccess, I am going to start leaving it out... even though I'm tempted not to, just to make IE8 users suffer a bit more.

Thanks.

@Jules Mazur do you have any references for those points? Most of what I do is SEO, and accessibility is important to me, so if that is the case I am more than receptive to leaving the meta declaration in.

adam-asdf answered Sep 19 '22 19:09