For some reason, the plain text character – on the HTML side is being displayed as – . The only thing I can think of that would cause this is the character encoding. My guess is that it's UTF-8, but I'm not sure how I am getting the weird characters. Is there an explanation?
What I mean by default is if the charset isn't specified.
That certainly looks like UTF-8 being interpreted as something else.
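To make that concrete, here is a small Python sketch of what happens when the UTF-8 bytes of an en dash are decoded with a single-byte encoding. It assumes the "something else" is Windows-1252 or ISO-8859-1, which is the usual culprit but is only a guess for this particular page.

```python
# U+2013 EN DASH, written as an escape so this file's own encoding cannot interfere.
original = "\u2013"

# In UTF-8 the en dash occupies three bytes.
utf8_bytes = original.encode("utf-8")

# Decoding those three bytes with a single-byte encoding yields three characters.
as_cp1252 = utf8_bytes.decode("cp1252")
as_latin1 = utf8_bytes.decode("latin-1")

# Reversing the wrong decode recovers the original, as long as no bytes were lost.
repaired = as_cp1252.encode("cp1252").decode("utf-8")

print(ascii(utf8_bytes), ascii(as_cp1252), ascii(as_latin1))
print(repaired == original)
```

One character turning into two or three odd ones is the typical signature of UTF-8 read as a single-byte encoding.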
HTML doesn't have a default. It's picked up from the headers of the transfer protocol (normally HTTP) or, failing that, from a BOM, from meta elements or, in the case of XHTML, the XML declaration. In the absence of any of those, the user-agent guesses.
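The lookup order described above can be sketched roughly as follows. This is an illustration only, not how any browser actually implements it: the function name sniff_encoding, the 1024-byte window and the regular expression are all assumptions of mine, the XML declaration case is left out, and the current HTML Standard actually checks for a BOM before the HTTP header, so treat the ordering as mirroring this answer rather than the spec.

```python
import codecs
import re
from email.message import Message

# Hypothetical, simplified pattern; real browsers run a much stricter prescan.
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?\s*([A-Za-z0-9_.:-]+)", re.IGNORECASE
)


def sniff_encoding(content_type, body):
    """Return a charset label, or None when the user agent would have to guess."""
    # 1. The transfer protocol header, e.g. "text/html; charset=utf-8".
    if content_type:
        msg = Message()
        msg["Content-Type"] = content_type
        charset = msg.get_content_charset()
        if charset:
            return charset

    # 2. A byte order mark at the very start of the document.
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if body.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if body.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"

    # 3. A meta element near the top of the markup.
    match = META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()

    # 4. Nothing declared anywhere.
    return None
```

A None result corresponds to the guessing case, which is where mojibake tends to creep in.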
HTTP has a default of ISO-8859-1, which even one HTML spec described as having "proved useless" [source] (they don't even go into the fact that a large amount of stuff out there labelled as ISO-8859-1 is actually CP-1252).
Hence, forget about defaults: always set your HTTP headers and your meta elements (the latter in case the page is saved as a file).
And always do so as UTF-8. Anything else in this day and age is just an act of masochism.
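As a rough sketch of what "set both" looks like, here is a minimal WSGI application in Python that declares UTF-8 in the HTTP header and in the markup, and encodes the body to match. The names PAGE and app and the page content are placeholders of mine, not anything from the question.

```python
# Placeholder page; the meta element covers the case where the page is saved to disk.
PAGE = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Encoding test</title>
</head>
<body><p>En dash: \u2013</p></body>
</html>
"""


def app(environ, start_response):
    # Encode the body with the same encoding that the header announces.
    body = PAGE.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the standard library can serve it with wsgiref.simple_server.make_server("", 8000, app).serve_forever(). The point is only that the header, the meta element and the actual bytes all agree.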
The !DOCTYPE doesn't set a character encoding; the meta element together with the (newly standardized) charset attribute does. If it's absent I'm not entirely sure how the browser determines the encoding.
I believe the problem you're having though is that your page is saved in one encoding and served in another.
Just make sure you set <meta charset="utf-8"/> and make sure your document is in fact saved as UTF-8, and it should work.
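One way to check the "is in fact UTF-8" part is to try decoding the file's raw bytes strictly. The helper below is a minimal sketch and its name is_valid_utf8 is my own. Passing does not prove the file was meant as UTF-8 (pure ASCII passes too), but failing does show it was saved as something else.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# An en dash saved as UTF-8 passes; the same character saved as
# Windows-1252 is a lone 0x96 byte, which is not valid UTF-8.
print(is_valid_utf8("\u2013".encode("utf-8")))
print(is_valid_utf8("\u2013".encode("cp1252")))
```

For a real page you would feed it the bytes read from disk in binary mode, for example from pathlib.Path("index.html").read_bytes(), where index.html stands in for your own file.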