I'm editing books/articles in HTML. These texts were printed once and I scan them, convert them into an intermediate XML-Format and then I transform them into HTML (by XSLT). Because some of those texts are extinct from the market today and are only available through the major libraries I want to publish them in a way so that people could possibly cite them by referring to the page numbers in the original document. For this purpose my intermediate XML-format has an element that marks a page-break. Right now I'm working on the XML->HTML transformations and I'm wondering myself how to transform these page breaks in HTML. They should not appear in the final HTML by default (so a simple | doesn't fit) but I plan to wrap these documents with some lightweight JavaScript that will show the markers when needed. I thought about <span>
s with a | in it that are hidden by default.
Is there a better, possibly 'semantic' way to this problem?
Page breaks are very much a thing of layout, and HTML isn't designed to describe layout, so you aren't going to find anything that is semantic for this within the language.
The best you can hope for is some sort of kludge.
Since a page break can occur in the middle of a paragraph, and <p>
elements can contain only inline elements you can eliminate most of the options from the outset.
The two possibilities that suggest themselves to me are <span>
and <a>
. The former has no semantics, that latter is designed to be linked to (with a name attribute) or from (with an href attribute), and you could consider a page from an original document something that you might wish to link to.
No matter what element you use, I wouldn't include a marker in it and then hide it with CSS. That sort of presentational flag is something I would consider adding via :before
in a stylesheet (combined with a descendent selector for a body class that can be toggled with JS since you want the toggle)
Alternatively, if you want to take a (very) broad view of the meaning of "HTML" you could consider the l element (from the defunct XHTML 2 drafts) and markup each line of the original document. Adding a class would indicate where a new page began (and you could use CSS counters and borders to clearly indicate each page and number it should you so wish). Pity the browser vendors refused to get behind a real semantic markup language and favoured HTML 5 instead.
Use a <div class="Page">
for each page, and have a stylesheet containing:
.Page {
page-break-after: always;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With