Should an end tag close all unclosed intervening start tags with omitted end tags?

Tags: html, html4, sgml

Am I reading the HTML 4.01 standard wrong, or is Google? In HTML 4.01, if I write:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html> <head> <body>plain <em>+em <strong>+strong </em>-em

The rendering in Google Chrome is:

plain +em +strong -em (here "+em" is italic, "+strong" is bold italic, and "-em" is still bold, though no longer italic)

This seems to contradict the HTML 4.01 standard, which summarizes the underlying SGML rules as: “an end tag closes, back to the matching start tag, all unclosed intervening start tags with omitted end tags”.¹

That is, the </em> end tag should close not only the <em> start tag but also the unclosed intervening <strong> start tag, and the rendering should be:

plain +em +strong -em (here "+em" is italic, "+strong" is bold italic, and "-em" is plain, neither bold nor italic)

A commenter pointed out that it is bad practice to leave tags open, but this is only an academic example. An equally good example would be: <em> +em <strong> +strong </em> -em </strong>. It was my understanding from the HTML 4.01 standard that this code fragment would not work as intended because of the overlapping elements: the </em> end tag should implicitly close the <strong>. The fact that it did work as intended was surprising, and this is what led to my question.

And it turned out I proposed a false dichotomy in the question: neither Google nor I were reading the HTML 4.01 standard wrong. A private correspondent at w3.org pointed me to Web SGML and HTML 4.0 Explained by Martin Bryan, which explains that “[t]he parsing program will automatically close any currently open embedded element which has been declared as having omissible end-tags when it encounters an end-tag for a higher level element. (If an embedded element whose end-tag cannot be omitted is still open, however, the program will report an error in the coding.)”² (Emphasis added.) Bryan’s summarization of the SGML standard is right, and HTML 4.01’s summarization is wrong.

MetaEd asked Jan 06 '12 23:01


1 Answer

The statement quoted from the HTML 4.01 specification is either very obscure or just plain wrong. HTML 4.01 has specific rules for end tag omission, and these rules depend on the element. For example, the end tag of a p element may be omitted, but the end tag of an em element may never be omitted. The statement in the specification probably tries to say that an end tag implicitly closes any inner elements that have not yet been closed, to the extent that end tag omission is allowed for those elements.
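The rule as read above can be sketched as a small stack model. This is only an illustration, not how any real SGML parser or browser is implemented: the function name close_element is hypothetical, OMISSIBLE_END lists just a few of the HTML 4.01 elements whose end tag is optional, and the choice to close a non-omissible element anyway while recording an error is an assumption of this sketch.

```python
# Illustrative subset of HTML 4.01 elements whose end tag is optional.
OMISSIBLE_END = {"p", "li", "dt", "dd", "tr", "td", "th", "option"}


def close_element(open_elements, name):
    """Apply the end tag </name> to a stack of open element names.

    Returns (remaining_stack, implied, errors):
      implied -- inner elements closed implicitly (end tag is omissible)
      errors  -- messages for inner elements whose end tag is required
    """
    if name not in open_elements:
        return list(open_elements), [], [f"stray </{name}>"]

    # Index of the innermost open element with the matching name.
    idx = len(open_elements) - 1 - open_elements[::-1].index(name)
    inner = open_elements[idx + 1:]

    # Walk the intervening elements from innermost to outermost.
    implied = [e for e in reversed(inner) if e in OMISSIBLE_END]
    errors = [f"<{e}> requires an end tag"
              for e in reversed(inner) if e not in OMISSIBLE_END]
    return open_elements[:idx], implied, errors


# </ul> may implicitly close the open <p> and <li>.
ok = close_element(["body", "ul", "li", "p"], "ul")

# </em> cannot silently close <strong>, whose end tag is required.
bad = close_element(["body", "em", "strong"], "em")
```

Under this model, the fragment from the question is an error rather than a case of implicit closing, because strong is not among the elements with an omissible end tag.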

No browser has ever implemented HTML 4.01 (or any earlier HTML specification) as defined, with the SGML features that are formally part of it. Anything that the HTML specifications say about SGML should be taken as purely theoretical until proven otherwise.

HTML5 doesn’t change the rules of the game in this respect, except that it writes down the error handling rules. In simple cases like these, the rules just turn the traditional browser behavior into the norm. That behavior is tag-soup-oriented, treating certain tags more or less as formatting commands: <em> means “italicize,” </em> means “stop italicizing,” etc. But HTML5 also defines error handling more formally, so that even with such tag soup usage, the document tree constructed in the DOM is well-defined.
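The "formatting command" view can be illustrated with a toy model built on Python's standard html.parser module. This is a sketch only: it is not the HTML5 tree construction algorithm and builds no DOM, and the names FormattingToggles and formatting_runs are hypothetical. Each start tag switches a format on, each end tag switches it off, and every text run is recorded with the formats active at that point.

```python
from html.parser import HTMLParser


class FormattingToggles(HTMLParser):
    """Treat start and end tags as 'format on' / 'format off' commands."""

    def __init__(self):
        super().__init__()
        self.active = set()   # formats currently switched on
        self.runs = []        # (text, sorted list of active formats)

    def handle_starttag(self, tag, attrs):
        self.active.add(tag)

    def handle_endtag(self, tag):
        # Switch off only the named format; others stay active.
        self.active.discard(tag)

    def handle_data(self, data):
        self.runs.append((data, sorted(self.active)))


def formatting_runs(markup):
    parser = FormattingToggles()
    parser.feed(markup)
    parser.close()  # flush any buffered trailing text
    return parser.runs


runs = formatting_runs("plain <em>+em <strong>+strong </em>-em")
for text, formats in runs:
    print(repr(text), formats)
```

In this model the </em> tag ends only the italics, so the final "-em" run is still marked strong. That is consistent with the overlapping-elements example in the question working as intended in the browser.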

Jukka K. Korpela answered Sep 25 '22 06:09