Why are unclosed H3 tags breaking this page?

If you look at this page in a modern browser (latest stable builds of Chrome, Firefox, or IE), you will see that the text keeps increasing in size. Looking at the source code, this seems to be caused by unclosed <h3>s.

However, I recall that most browsers autoclose tags whenever given the opportunity to. The following code (same doctype as the broken site) works fine with all tags getting closed:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
  <head></head>
  <body>Hello
    <h3>My
    <h3>Name
    <h3>Is
    <h3>Manish
  </body>
</html>

So unclosed <h3>s may not be the issue, or may be only part of it.

My question is: why aren't the browsers autoclosing the tags on that page?

asked Feb 09 '13 10:02 by Manishearth


1 Answer

First and foremost, the h1 to h6 elements have always required both their opening and closing tags in order to validate, even back in HTML 3.2:

H1, H2, H3, H4, H5 and H6 are used for document headings. You always need the start and end tags.

So both the page in the link and your example are invalid.

That said, it's interesting how a browser handles both cases differently (and yes, unclosed <h3> tags are the issue):

In a parsed HTML document, the h1 to h6 elements can never be direct children of one another, similar to how p elements can never be children of each other. A heading start tag (<h1> to <h6>) that directly follows an unclosed heading start tag implicitly closes that heading, but only in that case. Therefore, all the h3 elements in your example are really siblings of one another, and not successive descendants.

What's happening in that page, though, is that the h3 elements aren't siblings of each other at all. Instead, they're all separated by table cells, font elements, and so on. It's quite a mess (although that's probably to be expected of a page authored with Microsoft FrontPage [1]).

However, while the <tr> and <td> tags have their own closing tags, this does not cause the <h3> tags between them to implicitly close. They're still open! Since none of the <h3> tags are closed, and there are intermediate <font> and other tags sitting between the h3 elements, each h3 element ends up containing all of the following h3 elements as descendants (though not as direct children), in spite of the <tr> and <td> elements:

h3
  font
    font
      ...
        h3
         font
           font
             ...

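As a result, the font size increments with each successive h3, and a sizable (ha!) catastrophe ensues. This is because browsers' default styles size an h3 relative to its parent (typically font-size: 1.17em), so each nested h3 scales up the size it inherits. Note that the font elements themselves are irrelevant, since none of them define a size attribute.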
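To get a feel for how quickly this compounds, here is a rough Python sketch. It assumes the common browser defaults of a 16px base font and h3 { font-size: 1.17em }; both numbers, and the function name nested_h3_size, are illustrative assumptions rather than anything taken from the page in question.

```python
BASE_PX = 16.0   # assumed base font size (a common browser default)
H3_SCALE = 1.17  # assumed default h3 font-size, in em


def nested_h3_size(depth, base_px=BASE_PX, scale=H3_SCALE):
    """Font size in px of an h3 nested `depth` headings deep (1 = top level).

    An em-based size is relative to the parent's computed size, so every
    enclosing h3 multiplies the inherited size by the same factor.
    """
    return base_px * scale ** depth


# Print the size at the first few nesting depths.
for depth in range(1, 6):
    print(depth, round(nested_h3_size(depth), 2))
```

Under those assumptions a top-level h3 comes out a little under 19px, while one nested five headings deep is already around 35px, which matches the "text keeps growing" symptom.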

The main takeaway from all this?

Validate your freakin' markup. [2] In particular, close all your freakin' tags (except where closing tags are forbidden).


Although the page and your example use the HTML 3.2 doctype, which triggers quirks mode, it should be noted that this behavior is consistent in both quirks mode and standards mode. In fact, the HTML5 spec contains a section entirely dedicated to parsing and DOM tree construction in order to set in stone various browser behaviors with respect to invalid markup (for compatibility with legacy markup and all that). Browsers are expected to follow this specification even in standards mode, hence the consistent behavior in both modes in most browsers.

That section contains a rule on how to handle this specific situation:

A start tag whose tag name is one of: "h1", "h2", "h3", "h4", "h5", "h6"

If the stack of open elements has a p element in button scope, then act as if an end tag with the tag name "p" had been seen.

If the current node is an element whose tag name is one of "h1", "h2", "h3", "h4", "h5", or "h6", then this is a parse error; pop the current node off the stack of open elements.

Insert an HTML element for the token.

This means that only when the parser encounters a heading start tag while the current node is itself an open heading element does it flag a parse error and close the previously opened heading before entering the new one, which is what's happening with your example. Otherwise, nothing special happens: the new heading is simply inserted inside whatever element is currently open, and the earlier heading stays open.
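The difference between the two cases can be sketched with a toy model in Python. This is not a real HTML parser: it only handles start tags, it applies just the heading part of the rule quoted above (the p-in-button-scope step, text, end tags, and table handling are all left out), and the names build_tree and max_heading_depth are made up for illustration.

```python
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def build_tree(start_tags):
    """Toy tree builder: start tags only, heading rule only."""
    root = {"name": "body", "children": []}
    stack = [root]  # stack of open elements; stack[-1] is the current node
    for name in start_tags:
        # Heading rule: a heading start tag pops the current node
        # only if that current node is itself a heading.
        if name in HEADINGS and stack[-1]["name"] in HEADINGS:
            stack.pop()
        node = {"name": name, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)  # nothing else closes it in this toy model
    return root


def max_heading_depth(node, depth=0):
    """Deepest nesting level of heading elements at or below `node`."""
    if node["name"] in HEADINGS:
        depth += 1
    return max([depth] + [max_heading_depth(c, depth) for c in node["children"]])


# Your example: each <h3> directly follows an open <h3>, so they end up siblings.
siblings = build_tree(["h3", "h3", "h3", "h3"])
# The broken page, simplified: a <font> sits between the headings.
nested = build_tree(["h3", "font", "h3", "font", "h3"])

print(max_heading_depth(siblings))  # 1
print(max_heading_depth(nested))    # 3
```

In the first case the current node is always the previous h3, so it gets popped each time. In the second, the current node is a font element when the next h3 arrives, so the rule doesn't fire and the headings pile up inside one another.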

That said, please don't rely on this. A parse error is still an error; be nice to the parser and don't throw errors at it just because you can. Just write valid code and you'll be fine. Of course, when browsers continue to mess up even after you've validated your code, then you can worry.


[1] Which, incidentally, was my first HTML editor too... I was 9.

[2] Don't overdo it, but don't neglect to do it either.

answered Oct 13 '22 01:10 by BoltClock