How to express a page break semantically correct in HTML?

Tags:

html

xhtml

I'm editing books and articles in HTML. These texts were printed once; I scan them, convert them into an intermediate XML format, and then transform them into HTML with XSLT. Because some of these texts are out of print today and are only available through the major libraries, I want to publish them in a way that lets people cite them by referring to the page numbers of the original document. For this purpose my intermediate XML format has an element that marks a page break. Right now I'm working on the XML->HTML transformations, and I'm wondering how to represent these page breaks in HTML. They should not appear in the final HTML by default (so a simple | doesn't fit), but I plan to wrap these documents with some lightweight JavaScript that will show the markers when needed. I thought about <span>s containing a | that are hidden by default.

Is there a better, possibly 'semantic' way to solve this problem?

Struce asked Aug 02 '10 22:08


2 Answers

Page breaks are very much a thing of layout, and HTML isn't designed to describe layout, so you aren't going to find anything that is semantic for this within the language.

The best you can hope for is some sort of kludge.

Since a page break can occur in the middle of a paragraph, and <p> elements can contain only inline elements, you can eliminate most of the options from the outset.

The two possibilities that suggest themselves to me are <span> and <a>. The former has no semantics; the latter is designed to be linked to (with a name attribute) or from (with an href attribute), and you could consider a page from an original document something that you might wish to link to.
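To make the <a> option concrete, here is a minimal sketch of what one marker could look like. The helper name `pageBreakMarker`, the class name `pagebreak`, the `page-N` id scheme, and the `data-page` attribute are all assumptions for illustration, not part of the question. It uses `id` rather than `name` as the link target, since current HTML treats `name` on <a> as obsolete; the element is left empty so that no marker text ends up in the content.

```javascript
// Hypothetical helper: builds the markup for one page-break marker.
// The class name "pagebreak" and the "page-N" id scheme are assumptions.
function pageBreakMarker(pageNumber) {
  if (!Number.isInteger(pageNumber) || pageNumber < 1) {
    throw new RangeError("pageNumber must be a positive integer");
  }
  // Empty inline element: it can sit in the middle of a paragraph.
  return `<a class="pagebreak" id="page-${pageNumber}" data-page="${pageNumber}"></a>`;
}

// Usage: a marker placed where page 13 of the original begins.
const paragraph = `<p>end of one page ${pageBreakMarker(13)}start of the next</p>`;
```

With an id like this, a citation can link straight to the start of a page using a fragment such as #page-13. In the real pipeline the same markup would presumably be emitted by the XSLT step rather than by JavaScript.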

No matter what element you use, I wouldn't include a marker in it and then hide it with CSS. That sort of presentational flag is something I would consider adding via :before in a stylesheet (combined with a descendant selector for a body class that can be toggled with JS, since you want the toggle).
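A rough sketch of that idea, assuming the marker elements are empty <a> elements carrying a class `pagebreak` and a `data-page` attribute, and that the body class is called `show-page-breaks` (all of these names, and both function names, are made up for illustration). The `|` glyph lives only in the stylesheet, so it never appears in the document content. The rule uses the newer `::before` spelling; `:before` works the same way in this case.

```javascript
// Stylesheet text. The marker glyph is generated by CSS and only shows
// while <body> carries the (assumed) class "show-page-breaks".
const PAGE_BREAK_CSS = `
body.show-page-breaks a.pagebreak::before {
  content: " | [p. " attr(data-page) "] ";
  color: gray;
}`;

// Adds the stylesheet to the given document (pass `document` in a browser).
function installPageBreakStyles(doc) {
  const style = doc.createElement("style");
  style.textContent = PAGE_BREAK_CSS;
  doc.head.appendChild(style);
  return style;
}

// Flips the body class; returns true when the markers are now visible.
function togglePageBreaks(doc) {
  return doc.body.classList.toggle("show-page-breaks");
}
```

In a page, this could be wired up by calling `installPageBreakStyles(document)` once and then calling `togglePageBreaks(document)` from a button's click handler. Taking the document as a parameter is only a convenience so the functions can be exercised outside a browser.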

Alternatively, if you want to take a (very) broad view of the meaning of "HTML", you could consider the <l> element (from the defunct XHTML 2 drafts) and mark up each line of the original document. Adding a class would indicate where a new page began (and you could use CSS counters and borders to clearly indicate each page and number it, should you so wish). Pity the browser vendors refused to get behind a real semantic markup language and favoured HTML 5 instead.

Quentin answered Oct 06 '22 00:10


Use a <div class="Page"> for each page, and have a stylesheet containing:

.Page {
   page-break-after: always;
}

dan04 answered Oct 06 '22 00:10