Can I use unencoded ampersands (&) in html? [duplicate]

Tags: html, html-entities, ampersand

I'm building a website where I have to work with less-than-perfect master data (I guess I'm not the only one :-))

In my case I have to render an XML file to HTML (using XSL). Sometimes the master data already uses HTML entities (e.g. &eacute; in French words), so there I have to use disable-output-escaping="yes" to avoid double encoding.

The easiest solution is to disable output escaping altogether, so I never run the risk of double encoding.

The only characters in this master data that are left unencoded are the ampersands. But when I output them raw (so & rather than &amp;), all browsers seem to be OK with it.

So the question: what are the consequences of using unencoded ampersands in HTML?

asked Jun 27 '12 07:06

Peter

1 Answer

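AFAIK bare ampersands are illegal in HTML, except that HTML5 permits an ampersand that cannot be mistaken for the start of a character reference. With that out of the way, let's look at the consequences: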

You are now relying on the browser's ability to detect the problem and recover from it gracefully. To do this, the browser has to guess: "& " is "clearly" an ampersand followed by a space, and "&copy;" is clearly the copyright symbol. But what about the text fragment "edit&copy"? The browser I'm using right now mangles it.
If you are using XHTML, or if the content is ever going to be inserted into an XML document, the result will be a hard parser error.
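
Both consequences can be reproduced outside a browser. The following is a minimal sketch in Python, assuming that the standard library's html.unescape (documented as following the HTML5 rules for character references) is a fair stand-in for a browser's guessing, and using xml.etree.ElementTree as the XML parser. The sample strings and the helper name parses_as_xml are made up for illustration.

```python
import html
import xml.etree.ElementTree as ET

# An ampersand followed by a space cannot start a character reference,
# so it survives untouched.
print(html.unescape("Tom & Jerry"))

# "&copy" without a semicolon is still decoded as the copyright sign
# under the HTML5 rules, which mangles the intended "edit&copy" text.
print(html.unescape("edit&copy"))


def parses_as_xml(fragment: str) -> bool:
    """Return True if the fragment is well-formed XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


print(parses_as_xml("<p>Tom & Jerry</p>"))      # bare ampersand
print(parses_as_xml("<p>Tom &amp; Jerry</p>"))  # encoded ampersand
```

The first case is the harmless guess, the second is the mangling described above, and the last two show that an XML parser rejects the bare ampersand outright instead of guessing.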

Since it's more difficult to detect and account for these cases manually than it is to replace all ampersands that are not part of entities (say with a regex), you should really do the latter.
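
A regex for this could look roughly like the sketch below. It is written in Python only for illustration (the question itself uses XSL), and the names BARE_AMPERSAND and escape_bare_ampersands are hypothetical. It assumes that anything shaped like &name;, &#123; or &#x1F; is an intended entity and leaves it alone.

```python
import re

# Match "&" unless it already starts a named (&name;), decimal (&#233;)
# or hexadecimal (&#xE9;) character reference.
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(text: str) -> str:
    """Encode every ampersand that is not part of an entity as &amp;."""
    return BARE_AMPERSAND.sub("&amp;", text)


print(escape_bare_ampersands("caf&eacute; & cr&egrave;me"))
print(escape_bare_ampersands("edit&copy"))
```

Running the function on its own output changes nothing, because &amp; is itself recognised as an entity. One limitation of this sketch: it does not check names against the list of real HTML entities, so a text fragment such as "R&D;" would be left unencoded.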

answered Sep 19 '22 05:09

Jon
