Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Will HTML 5 validation be worth the candle?

Tags:

html

It's widely considered that the best reason to validate one's HTML is to ensure that all browsers will treat it consistently and predictably.

The HTML 5 draft, however, contains two specifications in one. First an author spec, describing the elements and attributes that HTML authors should use, and their interrelationships. Validation of an HTML 5 page is based on this spec. The elements and attributes included are not directly drawn from HTML 4, but have needed to be justified from first principles, which means that some HTML 4 features, such as the summary attribute on <table>, longdesc on <img> and the profile attribute on <head>, do not currently appear in this draft. Such features are not considered deprecated, they are simply not included. (Their absence from the draft remains a matter of dispute, although their inclusion any time soon does not seem likely.)

Second, the draft defines a browser processing specification that seeks to define exactly how a browser's parser will treat any byte stream it's given, regardless of how well formed and valid the HTML. This means that when the browsers fully support HTML 5, it will be possible to predict how any browser will treat HTML for a much wider range of inputs than merely those that pass validation.

In particular, because HTML 5 is defined to be 100% backward compatible with today's web, all valid HTML 4, and all invalid but commonly used mark-up, will continue to be processed exactly the same as it is today, regardless of whether it is HTML 5 valid or not.

Therefore, at the very minimum, anyone using any feature from HTML 5, HTML 4, or any previous version of HTML, plus many proprietary extensions, can be confident that their HTML will get consistent and predictable treatment across all browsers.

Given this, does it make any sense to limit ones HTML 5 to that which will validate, and what practical benefit will we get from doing so?

like image 680
Alohci Avatar asked Jan 11 '09 13:01

Alohci


People also ask

Does HTML validation matter?

Google Says Valid HTML Doesn't Matter “Although we do recommend using valid HTML, it's not likely to be a factor in how Google crawls and indexes your site.”

How important is HTML validation?

Validation improves usability and functionality because your users are less likely to run into errors when displayed on browsers compared to non-validated websites. Validation is fully compatible with a wide range of dynamic pages, scripting, active content, and multimedia presentations.

Does HTML validation affect SEO?

It appears that validation is a non-factor in SEO, as long as the page renders correctly in the browser.


1 Answers

  • First there’s the layer of validity corresponding to “parse errors” in the HTML5 parsing algorithm. This layer is similar to XML well-formedness. The foremost reason to avoid having errors in your documents on this layer is that you may get a surprising parse tree. If your document is error-free on this layer, you get fewer suprises to debug when writing JS or CSS that works with the DOM.
  • As a special case of the above-mentioned layer, there’s the HTML5 doctype: <!DOCTYPE html>. The reason why one would want to comply here is getting the standards mode in the easiest way possible. It’s something you can memorize unlike the HTML 4.01 or XHTML 1.0 doctypes you need to look up and copy and paste each time. Of course, the reason why you’d want the standards mode is fewer surprises on the CSS layer.
  • The main reason to care about validation on the layer higher than the parsing algorithm is catching your typos so that you spend less time debugging why your page isn’t working like you are expecting.
  • The previous point does not explain why you should care about validation when a given element or attribute that you did not misspell is supported by browsers as a matter of legacy but the HTML5 spec still shuns it. Here’s why HTML5 has obsoleted syntax like this:
    • HTML5 uses obsoletion to signal to authors that some features are a waste of their time. These include longdesc, summary and profile. (Note that people disagree on whether these are, indeed, waste of time, but as currently drafted, HTML5 makes them obsolete.) That is, if you have limited resources to improve accessibility, your limited resources are better spent on something other than longdesc and summary. If you have limited resources for semantic purity, your resources are better spent on something other than making sure you have the right incantation in profile.
    • HTML5 obsoletes some presentational features that can be duplicated in CSS to guide authors to use CSS for their own good. This way, authors who don’t consider maintainability on their own are supposed to be guided to more maintainable code nonetheless. Personally, I’d prefer making more of the legacy presentational stuff conforming and leaving it to authors themselves to decide which way of doing things works for them.
    • Some things are obsoleted for political reasons. The <font> element is obsoleted, because making it conforming would make anti-<font> standardistas think that the HTML5 people have gone crazy, which could lead to bad PR. <applet> is obsoleted mainly as a matter of principle of not giving special markup to one particular plug-in. The classid attribute on <object> is obsoleted, because it’s in practice ActiveX-specific.
    • Some things are obsoleted on the basis of language design aesthetics. This includes the name attribute on <a> and the language attribute on <script>.

(I develop the Validator.nu HTML5 validator which is also the HTML5 validation engine used by the W3C validator.)

like image 132
hsivonen Avatar answered Oct 06 '22 01:10

hsivonen