What problem does XHTML strict solve?

I really don't understand the fascination with XHTML Strict. Inline JavaScript typically requires a rat's nest of escapes to make it compatible with XHTML and semi-backwards-compatible with MSIE 5 and 6. Then there is the issue of not being OCD enough about user input to make sure you don't miss any illegal characters. It just seems like more effort than it's worth. Never mind that almost every developer I've worked alongside keeps forgetting to make sure the Content-Type returned by the server is changed from text/html to application/xhtml+xml for XHTML pages.

I wish I knew the name of the blogger, but someone else has pointed out that the majority of supposedly XHTML-compliant websites and open source packages are actually not compliant because of that last issue: forgetting to set the Content-Type header correctly.

I'm looking to understand why XHTML is useful, or to build enough of an arsenal of arguments to prevent it from ever being used in future projects that I have influence over.

Asked Nov 10 '08 at 18:11 by David



2 Answers

XHTML1 vs HTML4 and Strict vs Transitional are completely orthogonal issues.

XML might not give any huge advantage to browsers today, but on the server end it's an order of magnitude easier to process documents using XML than trying to parse the mess that is old-school-SGML-except-not-really HTML4.
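To illustrate the server-side point, here is a minimal Python sketch (not part of the original answer) that pulls every link out of a small XHTML document using only the standard library's `xml.etree.ElementTree`. The `SAMPLE` document and the `extract_links` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The XHTML namespace; elements parsed from an XHTML document carry it.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample document, kept free of named entities such as &nbsp;.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body>
    <p>See <a href="/docs">the docs</a> and <a href="/faq">the FAQ</a>.</p>
  </body>
</html>"""


def extract_links(xhtml_text):
    """Return the href of every <a> element, in document order."""
    root = ET.fromstring(xhtml_text)
    return [a.get("href") for a in root.iter(f"{{{XHTML_NS}}}a")]


print(extract_links(SAMPLE))
```

Doing the same against tag-soup HTML4 would need a forgiving HTML parser rather than a plain XML one. One caveat with this sketch: a document that uses named entities like `&nbsp;` would need extra entity handling before ElementTree accepts it.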

Restricting yourself to [X]HTML Strict doesn't achieve anything in itself, other than simply that it discourages the use of old, less-maintainable techniques you shouldn't be using anyway.

Inline javascript typically requires a rats nest of escapes to make it compatible with XHTML

You can get away without any escapes as long as you don't use the characters < or &. And ‘//<![CDATA[’ isn't really much worse than ‘<!--’ was in the old days.

In any case, keeping the scripting external is much more manageable; you don't want to be doing anything significant inline.
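As a rough check of the escaping rule above, the following Python sketch (an illustration, not something from the original answer) feeds two inline-script fragments to an XML parser. The fragments and the `is_well_formed` helper are hypothetical; the JavaScript inside them is never executed, only parsed as character data.

```python
import xml.etree.ElementTree as ET

# Inline script with raw < and & characters: not well-formed XML.
UNESCAPED = "<script>if (a < b && c) { run(); }</script>"

# The same script wrapped in a CDATA section hidden behind JS comments.
WRAPPED = """<script>//<![CDATA[
if (a < b && c) { run(); }
//]]></script>"""


def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(UNESCAPED))  # the raw < and & break the parse
print(is_well_formed(WRAPPED))    # the CDATA section makes them plain text

# The parser hands back the script body with < and & intact.
script_text = ET.fromstring(WRAPPED).text
```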

Then there is the issue of not being OCD enough on user input to make sure you don't miss any illegal characters.

Out-of-band characters are exactly as invalid in HTML4 Transitional as in XHTML1 Strict.

If you're accepting user-submitted HTML and not checking or escaping it with a fine enough tooth comb to prevent well-formedness errors, you have much bigger problems than just complying with a doctype. You'll be letting injection hacks through and leaving your site vulnerable to cross-site scripting security holes.
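A minimal sketch of that point in Python, assuming the simplest case where user input is treated as plain text rather than as markup. The `render_comment` helper and the sample input are hypothetical; `html.escape` is the standard library function.

```python
from html import escape
import xml.etree.ElementTree as ET


def render_comment(user_text):
    """Wrap untrusted text in a paragraph, escaping markup characters."""
    return f'<p class="comment">{escape(user_text)}</p>'


hostile = '<script>alert("x")</script> & more'
rendered = render_comment(hostile)

# The escaped output contains no live <script> tag and is well-formed XML,
# so the same step covers both the injection risk and the parse errors.
round_tripped = ET.fromstring(rendered).text
print(rendered)
```

Escaping alone does not deal with characters that XML forbids outright, such as most control characters; those would need to be filtered separately. Accepting real user-submitted HTML, rather than plain text, calls for a proper sanitizer and is beyond this sketch.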

forgetting to ensure the content-type returned from the server is reset for XHTML pages from text/html to application/xhtml+xml.

It's not ‘forgetting’, it's deliberate: there is not really that much point in serving application/xhtml+xml today. To account for IE you have to sniff UA, and then make sure you understand the CSS and JavaScript differences that pop up in both parsing modes... you can do it to prove your technical prowess, but it doesn't really get you anything.
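For anyone who does want to try it, here is a hedged Python sketch of the server-side decision. Note that it swaps in a different technique from the User-Agent sniffing described above: it looks at the request's Accept header instead, and it only checks whether application/xhtml+xml is listed with a non-zero q-value rather than doing full content negotiation. The `choose_content_type` function is hypothetical and not tied to any framework.

```python
XHTML_TYPE = "application/xhtml+xml"
HTML_TYPE = "text/html"


def choose_content_type(accept_header):
    """Pick a Content-Type for an XHTML page from an Accept header value."""
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if fields[0].lower() != XHTML_TYPE:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not accepted"
        if q > 0:
            return XHTML_TYPE
    # Fall back to legacy HTML for clients that never mention XHTML.
    return HTML_TYPE


print(choose_content_type("text/html,application/xhtml+xml,*/*;q=0.8"))
print(choose_content_type("*/*"))
```

Even with this in place, the CSS and JavaScript differences between the two parsing modes still have to be handled, which is the real cost the answer is pointing at.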

Serving XHTML as legacy HTML may not be ideal, but it lets you keep the simpler, more processable syntax of XML (and potential interoperability with other XML languages like SVG) whilst still being browser-friendly.

People complain about the pickiness of the well-formedness errors, but having those errors picked up straight away for you to fix them is way better than leaving them there silently, ready to trip up some future browser.
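A small Python sketch of what "picked up straight away" can look like in practice, assuming the markup is run through an XML parser as part of a build or test step. The `BROKEN` fragment and the `first_error` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The second <li> is never closed, so </ul> arrives too early.
BROKEN = "<ul>\n  <li>first</li>\n  <li>second\n</ul>"


def first_error(markup):
    """Return the (line, column) of the first parse error, or None."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return err.position
    return None


print(first_error(BROKEN))
print(first_error("<ul><li>ok</li></ul>"))
```

An HTML parser would silently recover from the same mistake, and different browsers may recover in different ways.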

Answered Sep 29 '22 at 12:09 by bobince


There is a great post about the usage of XHTML: "Beware of XHTML".

Hope it helps, Bruno Figueiredo

Answered Sep 29 '22 at 10:09 by Bruno Shine