What is the default character encoding for HTML?

Question

For some reason, the plain-text character – is being displayed as â€“ on the HTML side. The only cause I can think of is the character encoding. My guess is that the page is UTF-8, but I'm not sure how I'm getting the weird characters. Is there an explanation?

By "default" I mean the encoding that is used when no charset is specified.

Jon Hanna · Accepted Answer

That certainly looks like UTF-8 being interpreted as something else.
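As a rough illustration of that diagnosis, here is a small Python sketch. It assumes the "something else" is Windows-1252 (CP-1252), which is a guess on my part; the question doesn't say which encoding the browser actually picked. It encodes an en dash as UTF-8 and then decodes the bytes with the wrong codec, which produces the `â€` prefix seen in the question.

```python
# The en dash from the question (U+2013).
original = "\u2013"

# Encoded as UTF-8 it becomes three bytes: E2 80 93.
utf8_bytes = original.encode("utf-8")

# Decoding those same bytes as Windows-1252 yields three separate characters,
# one per byte, instead of a single en dash.
misread = utf8_bytes.decode("cp1252")

print(utf8_bytes.hex())   # e28093
print(len(misread))       # 3
print(misread[:2])        # â€

# Decoding with the encoding the bytes were written in round-trips cleanly.
assert utf8_bytes.decode("utf-8") == original
```

One character turning into two or three, usually starting with `â` or `Ã`, is the typical sign of UTF-8 bytes being read as a single-byte encoding.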

HTML doesn't have a default. It's picked up from the headers of the transfer protocol (normally HTTP) or failing that, from a BOM, from meta elements or, in the case of XHTML, the XML declaration. In the absence of any of those, the user-agent guesses.
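To make that lookup order concrete, here is a minimal Python sketch that follows the order described in this answer: HTTP header, then BOM, then a meta element or XML declaration. The function name `declared_encoding` and the regular expressions are my own illustration, not part of any library. Real browsers follow the HTML Standard's encoding sniffing algorithm, which is more involved and differs in details (for example, in how a BOM is weighed against the header), so treat this as a simplification.

```python
import codecs
import re

# Byte order marks to look for at the very start of the document.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Deliberately simple patterns; they are not a full HTML or HTTP parser.
_HEADER_CHARSET = re.compile(r'charset\s*=\s*"?([\w.:-]+)', re.IGNORECASE)
_META_CHARSET = re.compile(rb'<meta[^>]*charset\s*=\s*["\']?([\w.:-]+)', re.IGNORECASE)
_XML_DECL = re.compile(rb'<\?xml[^>]*encoding\s*=\s*["\']([\w.:-]+)["\']', re.IGNORECASE)


def declared_encoding(content_type, body):
    """Return the declared encoding name, or None if the user agent must guess.

    content_type: value of the HTTP Content-Type header (or None).
    body: the raw document bytes.
    """
    # 1. charset parameter on the transfer protocol's Content-Type header.
    if content_type:
        match = _HEADER_CHARSET.search(content_type)
        if match:
            return match.group(1).lower()

    # 2. Byte order mark at the start of the document.
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name

    # 3. A meta element or, for XHTML, the XML declaration near the top.
    head = body[:1024]
    match = _META_CHARSET.search(head) or _XML_DECL.search(head)
    if match:
        return match.group(1).decode("ascii").lower()

    # 4. Nothing declared: the user agent has to guess.
    return None
```

For example, `declared_encoding(None, b'<meta charset="utf-8">')` returns `"utf-8"`, while a document with no header charset, BOM, or declaration returns `None`.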

HTTP has a default of ISO-8859-1, which even one HTML spec described as having "proved useless" [source] (they don't even go into the fact that a large amount of stuff out there labelled as ISO-8859-1 is actually CP-1252).

Hence: forget about defaults, and always set both your HTTP headers and your meta elements (the latter in case the page is saved as a file).

And always do so as UTF-8. Anything else in this day and age is just an act of masochism.
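A minimal sketch of that advice in Python, assuming you build the response yourself. The helper name `build_html_response` is hypothetical; in practice your web server or framework will have its own way to set the header. The point is that the header, the meta element, and the actual bytes all agree on UTF-8.

```python
from html import escape


def build_html_response(body_text):
    """Return (headers, body_bytes) for a page declared and encoded as UTF-8."""
    page = (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        # Declared in the markup, in case the page is saved as a file.
        '<meta charset="utf-8">\n'
        "<title>Example</title>\n"
        "</head>\n"
        f"<body><p>{escape(body_text)}</p></body>\n"
        "</html>\n"
    )
    # The bytes on the wire must really be UTF-8, not just labelled as such.
    body = page.encode("utf-8")
    headers = {
        # Declared in the HTTP header as well.
        "Content-Type": "text/html; charset=utf-8",
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_html_response("Pages 10\u201320")
print(headers["Content-Type"])  # text/html; charset=utf-8
```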

powerbuoy · Answer

The !DOCTYPE doesn't set a character encoding; the meta element, together with its (newly standardized) charset attribute, does. If it's absent, I'm not entirely sure how the browser determines the encoding.

I believe the problem you're having though is that your page is saved in one encoding and served in another.

Just make sure you set <meta charset="utf-8"> and that your document is in fact saved as UTF-8, and it should work.
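If you want to check whether a document really is UTF-8, one simple approach is to try decoding its bytes strictly. This is only a sketch, and the helper name `is_valid_utf8` is my own. Note that pure ASCII also passes, since ASCII is a subset of UTF-8.

```python
def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# An en dash saved as UTF-8 passes the check.
print(is_valid_utf8("\u2013".encode("utf-8")))   # True

# The same character saved as Windows-1252 is the single byte 0x96,
# which is not valid UTF-8 on its own.
print(is_valid_utf8("\u2013".encode("cp1252")))  # False
```

To check a file, read it in binary mode (for example with `open(path, "rb")`) and pass the bytes to the function.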

Tags:

html

character-encoding

Chad Harrison
