
Is HTML5 valid XML?

Tags: html, xhtml

I am confused. A co-worker pointed out that tags ending in />, such as <br />, can still be used in HTML5. I thought that only the <br> style could be used. All of the "talk" across the Internet is about using the latter.

Could someone please explain this to me? This seems very confusing and poorly documented.

And this brings up another question: Is HTML5 considered to be well-formed XML?

Jack B. asked Apr 05 '11 21:04




2 Answers

No. Counter-examples:

These are valid HTML5 but invalid XHTML5:

  1. Some closing tags can be omitted:

    <p>First <p>Second 

    See: P-end-tag (</p>) is not needed in HTML

  2. Script content is raw text, so markup-like characters inside it need no escaping:

    <script><a></script> 

    See: What is CDATA in HTML?

  3. Attributes without values (boolean attributes):

    <input type="text" disabled /> 

    See: Correct value for disabled attribute

  4. Attributes without quotes, e.g.:

    <div data-a=b></div> 

    See: In XHTML 1.0 Strict do attribute values need to be surrounded with quotes?

  5. Implicit open elements and multiple top level elements.

    Some HTML elements, such as html, head, and body, are created implicitly by the parser. This lets the source appear to have "multiple top level elements":

    <!doctype html><title>a</title><p>a</p> 

    See: Is it necessary to write HEAD, BODY and HTML tags?
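The five counter-examples above can be checked mechanically. The following is a minimal sketch, assuming Python's standard-library xml.etree.ElementTree as a stand-in for a generic XML parser. It only tests well-formedness, not validity against the XHTML5 vocabulary, and the helper name is_well_formed_xml is made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(markup: str) -> bool:
    """Return True if an XML parser accepts the markup as one well-formed document."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# The HTML5 snippets from the list above, keyed by the feature they rely on.
HTML5_ONLY = {
    "omitted end tag": "<p>First <p>Second",
    "raw text in script": "<script><a></script>",
    "boolean attribute": '<input type="text" disabled />',
    "unquoted attribute": "<div data-a=b></div>",
    "implicit html element": "<!doctype html><title>a</title><p>a</p>",
}

if __name__ == "__main__":
    for feature, snippet in HTML5_ONLY.items():
        print(f"{feature}: well-formed XML? {is_well_formed_xml(snippet)}")
    # A self-closing void element is accepted by the XML parser.
    print(f"<br />: well-formed XML? {is_well_formed_xml('<br />')}")
```

Every snippet in the table should be rejected, while <br /> is accepted, which is the one form from the original question that works in both syntaxes.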

Valid XHTML that is invalid HTML:

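  1. CDATA sections containing text that would be invalid as markup, e.g. <p><![CDATA[<a>]]></p>. XML treats the section as literal text; the HTML syntax only allows CDATA sections inside foreign content (SVG and MathML).

  2. <!ENTITY> declarations and other <! constructs in an internal DTD subset, which allow attacks such as billion laughs: How does the billion laughs XML DoS attack work?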
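Both constructs are ordinary XML as far as a generic parser is concerned. A small sketch, again assuming Python's xml.etree.ElementTree as the parser (the entity name greet and the sample document are invented for illustration):

```python
import xml.etree.ElementTree as ET

# An internal DTD subset declares one entity; the body uses it next to a CDATA section.
DOC = '<!DOCTYPE p [<!ENTITY greet "hello">]><p>&greet; <![CDATA[<a>]]></p>'

root = ET.fromstring(DOC)
# The parser expands &greet; and keeps the CDATA content as literal text,
# so the <a> inside the CDATA section never becomes an element.
text = root.text
print(text)
```

The same entity mechanism, nested many levels deep, is what the billion laughs attack abuses, so untrusted XML should not be fed to a parser that expands entities without limits.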

Valid HTML and XHTML but with different meanings:

  1. HTML has more than 2,000 named character references (e.g. &pound;, &copy;); XML predefines only 5 (quot, amp, apos, lt, gt).
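The difference in named character references can be seen with a short sketch. It assumes Python's standard library throughout: html.unescape and the html.entities.html5 table stand in for an HTML parser's built-in list, and xml.etree.ElementTree stands in for a generic XML parser with no DTD.

```python
import html
import xml.etree.ElementTree as ET
from html.entities import html5

# HTML's named character references are resolved from a built-in table.
pound = html.unescape("&pound;")

# XML predefines only five entities, so &amp; parses without any DTD...
amp = ET.fromstring("<p>&amp;</p>").text

# ...but &pound; is undefined unless a DTD declares it.
try:
    ET.fromstring("<p>&pound;</p>")
    pound_ok_in_xml = True
except ET.ParseError:
    pound_ok_in_xml = False

print(pound, amp, pound_ok_in_xml, len(html5))
```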

There is an XML serialization of HTML5, called XHTML5. Basically, you're free to use either HTML5 (the HTML serialization) or XHTML5 (the XML serialization). The draft spec says HTML5 "is the format suggested for most authors," mainly for the same reasons people recommend text/html for XHTML 1.1.
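As a rough illustration of the XML serialization, the sketch below feeds a small XHTML5-style document to a generic XML parser. The sample document is invented for this example, and Python's xml.etree.ElementTree is assumed as the parser; a browser would choose its XML parser only when the document is served with an XML MIME type such as application/xhtml+xml.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Every element is closed, attributes are quoted and have values,
# and there is a single root element in the XHTML namespace.
XHTML5_DOC = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>a</title></head>
  <body>
    <p>First</p>
    <p>Second<br /></p>
    <input type="text" disabled="disabled" />
  </body>
</html>"""

root = ET.fromstring(XHTML5_DOC)
# Element names are reported with their namespace in {uri}local form.
paragraphs = [p.text for p in root.iter(f"{{{XHTML_NS}}}p")]
print(root.tag, paragraphs)
```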

Matthew Flaschen answered Oct 25 '22 12:10