
Is there a machine-readable version of HTML5 specs?

Tags: html

I'm looking for a machine-readable version of the HTML5 specs, akin to a DTD, although any format would do as long as it's parsable.

The HTML5 specs don't seem to contain anything of the sort, so my first idea was to look into validators. I dug into the sources of the validator.nu validator, but it seems that the schema it uses is built by parsing the specs (i.e. parsing their HTML and their English text), and I would have to build the validator to generate it.

More specifically, I'm looking for a list of elements, their content models, and a list of their attributes with their type and whether they are required or they have a default value.
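
To make that concrete, here is a minimal Python sketch of the shape of data I have in mind. The class names (`AttributeSpec`, `ElementSpec`), the field names, and the two hand-written entries are assumptions for illustration only; they are not taken from any existing machine-readable source.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AttributeSpec:
    """One attribute of an element (hypothetical structure)."""
    name: str
    value_type: str                 # e.g. "url", "text", "enum"
    required: bool = False
    default: Optional[str] = None   # None means "no default value"


@dataclass(frozen=True)
class ElementSpec:
    """One element with its content model and attributes (hypothetical structure)."""
    name: str
    content_model: str              # e.g. "flow", "phrasing", "nothing"
    attributes: Tuple[AttributeSpec, ...] = ()

    def missing_required(self, present):
        """Return the names of required attributes absent from `present`."""
        return [a.name for a in self.attributes
                if a.required and a.name not in present]


# Two hand-written example entries, only to show the shape of the data.
SPEC = {
    "img": ElementSpec(
        name="img",
        content_model="nothing",
        attributes=(
            AttributeSpec("src", "url", required=True),
            AttributeSpec("alt", "text"),
        ),
    ),
    "input": ElementSpec(
        name="input",
        content_model="nothing",
        attributes=(
            AttributeSpec("type", "enum", default="text"),
        ),
    ),
}

print(SPEC["img"].missing_required({"alt"}))
```

Any parsable format that could be loaded into something like `SPEC` would do; the point is that an application can query it rather than validate a document against it.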

Finally, I should mention that I'm not looking to validate specific documents; for that I would use W3C's validator or validator.nu directly. I'm looking for the specs themselves so that I can use them in my own applications.

Asked Jul 04 '11 at 22:07 by Josh Davis




2 Answers

UPDATING

Since 2014-10-28, HTML5 has been a W3C Recommendation. This question is not obsolete, though: validators are now more complex than a simple DTD.

ANSWER

There is no simple parser, as @ruediste's clues show. Today, perhaps the best parser is at https://validator.nu/. So:

  1. You already found the first part of the answer: validating HTML5 takes a complex parser, and validator.nu is a good one.
  2. The 2014-10-28 W3C Recommendation confirms that there is no simple mechanism (like a DTD or a list of elements) that can say "this is valid HTML5".
  3. This other question shows that, perhaps, only context (use/community) can validate the list of tags and attributes.
Answered Oct 28 '22 at 21:10 by 3 revs


Trawling through the W3C's site, I can only see two things of interest on this:

  • "As HTML5 is no longer formally based upon SGML, the DOCTYPE no longer serves this purpose, and thus no longer needs to refer to a DTD." (from the HTML5 working draft). It doesn't say there isn't a DTD, just that clients don't need one.
  • HTML5 is still a working draft, not a specification, which implies that a DTD may be published later.

I've probably looked as hard as you have and found nothing concrete. I think validator.nu's approach is the best one, as the working draft is likely to change several times before a specification is ever agreed upon. If someone did publish an unofficial DTD, it would need constant maintenance.

+1 great question, I wish I could find a concrete answer. I hope someone else can!

Answered Oct 28 '22 at 19:10 by Tak