I've recently noticed a lot of high profile sites using characters directly in their source, eg:
<q>“Hi there”</q>
Rather than:
<q>&ldquo;Hi there&rdquo;</q>
Which of these is preferred? I've always used entities in the past, but using the character directly seems more readable, and would seem to be OK in a Unicode document.
Entities let the parser distinguish characters that should be interpreted as markup from characters you actually want shown to the user, because HTML reserves a small set of characters (such as <, > and &) for itself.
The most common character entity in HTML is the non-breaking space, &nbsp;. Browsers normally collapse runs of whitespace: if you write 10 spaces in your text, 9 of them are dropped when the page is rendered. To add extra spaces to your text, use the &nbsp; character entity.
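As a minimal sketch of that distinction, Python's standard html module can escape the reserved characters so they display as text instead of being parsed as markup. The sample string is made up for illustration.

```python
import html

# Hypothetical user-supplied text containing characters HTML reserves.
raw = 'Use <q> for quotes & "inline" citations'

# html.escape replaces &, < and > (and, by default, quote characters)
# with entity references so the parser treats them as text.
escaped = html.escape(raw)
print(escaped)

# html.unescape reverses the substitution.
restored = html.unescape(escaped)
print(restored == raw)
```

Only these reserved characters need escaping; ordinary punctuation such as curly quotes can stay as it is in a UTF-8 document.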
Google's HTML/CSS Style Guide advises against using entity references: "Do not use entity references. There is no need to use entity references like &mdash;, &rdquo;, or &#x263a;, assuming the same encoding (UTF-8) is used for files and editors as well as among teams."
You can use numeric character references instead of entity names. A key benefit of numeric character references is that they have stronger browser support and can specify any Unicode character, whereas named entities cover only a subset. Note: entity names are case-sensitive!
See an HTML character entities reference for a complete list of entities for special characters and symbols. Tip: the non-breaking space (&nbsp;) creates a blank space between two items that cannot be separated by a line break.
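To make the named-versus-numeric point concrete, here is a small sketch using Python's html.unescape, which decodes entity references with the HTML5 entity table. The variable names are only illustrative.

```python
import html

# A named entity and its decimal and hexadecimal numeric references
# all decode to the same character, U+201C (left double quotation mark).
named = html.unescape("&ldquo;")
decimal = html.unescape("&#8220;")
hexadecimal = html.unescape("&#x201C;")
print(named == decimal == hexadecimal)

# Entity names are case-sensitive: these are two different characters.
upper = html.unescape("&Eacute;")  # capital E with acute accent
lower = html.unescape("&eacute;")  # small e with acute accent
print(upper != lower)

# The non-breaking space is U+00A0, not an ordinary space (U+0020).
nbsp = html.unescape("&nbsp;")
print(nbsp == "\u00a0")
```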
If the encoding is UTF-8, the normal characters will work fine, and there is no reason not to use them. Browsers that don't support UTF-8 will have lots of other issues while displaying a modern webpage, so don't worry about that.
So it is easier and more readable to use the characters and I would prefer to do so.
It also saves a couple of bytes which is good, although there is much more to gain by using compression and minification.
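A quick way to check the size difference is to count the encoded bytes of both versions of the example from the question. This sketch assumes the page is saved as UTF-8.

```python
# The same markup written with literal characters and with named entities.
literal = "<q>\u201cHi there\u201d</q>"
entity = "<q>&ldquo;Hi there&rdquo;</q>"

# Each curly quote takes 3 bytes in UTF-8; each named entity takes 7.
literal_bytes = len(literal.encode("utf-8"))
entity_bytes = len(entity.encode("utf-8"))

print(literal_bytes, entity_bytes, entity_bytes - literal_bytes)
```

Here the literal version is 21 bytes and the entity version is 29, a saving of 8 bytes for two characters.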
The main advantage I can see with encoding characters is that they'll look right, even if the page is interpreted as ASCII.
For example, if your page is just a raw HTML file, the default settings on some servers would be to serve it as text/html; charset=ISO-8859-1 (the default in HTTP 1.1). Even if you set the meta tag for content-type, the HTTP header has higher priority.
Whether this matters depends on how likely the page is to be served by a misconfigured server.
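The failure mode can be simulated roughly as follows: encode the markup as UTF-8 (what the file contains) and decode it as ISO-8859-1 (what the misconfigured header claims). The function name serve_and_misread is hypothetical. Python's iso-8859-1 codec is used as a stand-in here; browsers usually treat that label as windows-1252, so the exact garbage characters on screen may differ.

```python
import html

literal = "<q>\u201cHi there\u201d</q>"
entity = "<q>&ldquo;Hi there&rdquo;</q>"


def serve_and_misread(markup: str) -> str:
    """Simulate a UTF-8 file being interpreted as ISO-8859-1."""
    return markup.encode("utf-8").decode("iso-8859-1")


# Literal curly quotes turn into multi-character garbage.
garbled = serve_and_misread(literal)
print(garbled != literal)

# The entity version is pure ASCII, so it survives unchanged and
# still decodes to the intended characters.
survived = serve_and_misread(entity)
print(survived == entity)
print(html.unescape(survived) == literal)
```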