
Why doesn't Google close td and tr tags in tables?

Tags:

html

Looking at HTML source code of

http://www.google.com/finance/historical?cid=983582&startdate=Nov+28,+2000&enddate=Nov+27,+2010&num=200

I see that Google never closes td and tr tags. There is no </tr> or </td> anywhere in the source.

Why?

<tr class=bb>
<th class="bb lm">Date
<th class="rgt bb">Open
<th class="rgt bb">High
<th class="rgt bb">Low
<th class="rgt bb">Close
<th class="rgt bb rm">Volume
<tr>
<td class="lm">Nov 26, 2010
<td class="rgt">11,183.50
<td class="rgt">11,183.50
<td class="rgt">11,067.17
<td class="rgt">11,092.00
<td class="rgt rm">68,396,121
<tr>

Is it to make the page harder to parse, since an XML parser won't be able to read it? I have noticed that &output=csv is not available for indices (this URL won't work: http://www.google.com/finance?q=INDEXDJX:.DJI&output=csv) whereas it is available for stocks (http://www.google.com/finance/historical?q=NASDAQ:GOOG&output=csv will work), so to get historical data in CSV for indices you have to parse the HTML yourself.

Rebol Tutorial asked Nov 27 '10 19:11



1 Answer

This is HTML 4, not XML. As the W3C HTML 4 specification states:

11.2.6 Table cells: The TH and TD elements

Start tag: required, End tag: optional

Ditto for tr:

11.2.5 Table rows: The TR element

Start tag: required, End tag: optional

I believe the intent is to minimize page size by omitting the optional end tags. Google also applies various additional optimizations that may actually produce invalid HTML, but browsers handle these in tag-soup mode.
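Since the question asks about parsing, here is a minimal sketch in Python showing that markup like this is rejected by an XML parser but can still be read by an HTML parser. It uses only the standard library (`html.parser` and `xml.etree.ElementTree`). The names `SAMPLE`, `TableRows`, `parse_rows` and `is_well_formed_xml` are made up for illustration, and the sample is a trimmed copy of the snippet from the question with a wrapping `<table>` added as an assumption.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Trimmed from the question's snippet; the <table> wrapper is an assumption.
SAMPLE = """<table>
<tr class=bb>
<th class="bb lm">Date
<th class="rgt bb">Open
<tr>
<td class="lm">Nov 26, 2010
<td class="rgt">11,183.50
</table>"""


class TableRows(HTMLParser):
    """Collect cell text row by row (hypothetical helper).

    A new <tr>, <td> or <th> start tag is treated as implicitly
    closing the previous row or cell, as HTML 4 allows.
    """

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])          # start a new row
            self._in_cell = False
        elif tag in ("td", "th"):
            if not self.rows:             # cell without an explicit <tr>
                self.rows.append([])
            self.rows[-1].append("")      # start a new cell
            self._in_cell = True

    def handle_endtag(self, tag):
        # End tags are optional, but honour them when present.
        if tag in ("td", "th", "tr", "table"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def parse_rows(html):
    """Return a list of rows, each a list of stripped cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return [[cell.strip() for cell in row] for row in parser.rows if row]


def is_well_formed_xml(text):
    """Return True if an XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(is_well_formed_xml(SAMPLE))
    for row in parse_rows(SAMPLE):
        print(row)
```

Empty rows, such as a trailing `<tr>` with no cells, are dropped by `parse_rows`. This only covers flat tables like the one quoted; nested tables or cells containing other markup would need more care.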

Sinan Ünür answered Sep 20 '22 12:09
