
Divide HTML into pages, split long paragraphs

I'm not sure whether this can be done with my current approach. I'm trying to fit the contents of an HTML document into pages that are the size of the current viewport. I currently do this by iterating through all of the document's elements and checking whether each element's top offset lies within the current page's boundary. Whenever it does not, that offset becomes the start of a new page, and the page boundary is set to that offset plus the viewport's height.

The problem I'm facing is that often there will be an element (a paragraph, for example) whose height is larger than the viewport itself, so even if the algorithm places this element at the start of a new page, its contents will overflow. I'm trying to find a way to split such elements in a way that the first slice will occupy the remaining part of the page.

This presents further difficulties. Even if I could determine how much of a paragraph's text still fits within the remainder of a page, which in itself has proven quite difficult, I would still have the problem of the DOM not updating immediately after splitting the paragraph. That would throw off the calculation of the next page, or at least force me to break the recursion, which would complicate the algorithm even more.

Any suggestions on how to split a paragraph so that the first slice takes up the remaining space on the page are welcome.

EDIT: It is worth noting that this only works on very simple HTML with no absolutely positioned or floated elements. This is not a problem in my case.

This is my code so far:

var elementIndex = -1;
var pages = 1;
var pageBoundary = 0;
var pageBreaks = [];

function calculatePages() {
    //first page boundary is window height
    pageBoundary = $(window).height();
    //do calculations
    iterateElements($("body"));
    //print results
    console.log(pageBreaks);
}

function iterateElements(parent) {

    $.each($(parent).children(), function(i, e) {
        //increase current element index
        elementIndex = elementIndex + 1;
        //get element's top offset
        var offsetTop = $(e).offset().top;
        //get the last position that the element occupies
        var elementSpan = offsetTop + $(e).outerHeight();

        if ($(e).children().length === 0) { //only leaf nodes will be set as page breaks
            //element's start position is outside page boundary
            if (offsetTop >= pageBoundary) {
                //mark page start with red in order to visualize
                $(e).css("border-top", "1px solid red"); //.css() keeps any existing inline styles

                //increase page count
                pages = pages + 1;
                //new page starts at element's top, next page boundary
                //is element's starting position plus the viewport's height
                pageBoundary = offsetTop + $(window).height();
                //store index of page break
                pageBreaks.push(elementIndex);
            }
            //element's start position is inside current page, but contents overflow
            else if (elementSpan >= pageBoundary) {
                //NO IDEA WHAT TO DO HERE
                //NEED A WAY TO SPLIT LARGE ELEMENTS
            }
        } 

        iterateElements(e);
    });
}

$(function() {
    calculatePages();
});
JayPea asked Mar 31 '13 18:03


1 Answer

I have done something similar to this. The approach I took was to check the height of the page container. If it was greater than the maximum, I knew that elements needed to be moved to the next page.

If there are multiple elements, I can move the last element to the next page.

If there is only one element, it needs to be split. Let's call this element X. Create a new paragraph or section on the next page; let's call that element Y. Now move words or characters from the end of element X to the start of element Y until the height of element X fits into the page.
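
The word-moving step could look roughly like the sketch below. It is only an illustration under stated assumptions: `splitToFit` and `splitElement` are hypothetical names, the text is split on whitespace, and inline markup inside the paragraph is ignored because `textContent` flattens it. The measuring is passed in as a `fits` callback so the loop itself does not depend on the DOM.

```javascript
// Hypothetical helper: move words from the end of `words` into a tail
// until `fits(head)` reports that the remaining head fits the page.
function splitToFit(words, fits) {
  const head = words.slice(); // work on a copy, leave the input untouched
  const tail = [];
  // Keep at least one word in the head so a single oversized word
  // cannot empty the element and stall the pagination.
  while (head.length > 1 && !fits(head)) {
    tail.unshift(head.pop());
  }
  return { head, tail };
}

// Browser-only wiring (assumed names): `x` is the overflowing element,
// `y` the new element on the next page, `maxHeight` the page height in px.
function splitElement(x, y, maxHeight) {
  const words = x.textContent.split(/\s+/).filter(Boolean);
  const { head, tail } = splitToFit(words, (candidate) => {
    x.textContent = candidate.join(" ");
    // Reading offsetHeight forces a synchronous layout, so the value
    // already reflects the text that was just assigned.
    return x.offsetHeight <= maxHeight;
  });
  x.textContent = head.join(" ");
  y.textContent = tail.join(" ");
}
```

Because layout properties such as `offsetHeight` are computed synchronously when they are read, the measurement after each change should be up to date, which speaks to the concern in the question about the DOM not updating immediately. Moving one word at a time is simple but slow for very long paragraphs; a binary search over the word count would be one possible refinement.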

After this you can repeat for the next page.
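
Put together, the page-by-page loop might be sketched as follows. This is a minimal model rather than a finished implementation: `paginate` is a hypothetical name, each paragraph is represented as an array of words, and `fits(page)` is an assumed callback that reports whether a page (an array of paragraphs) stays within the maximum height. In a browser, `fits` would render the page into its container and compare the container's height with the viewport height.

```javascript
// Hypothetical sketch of the loop described above.
// paragraphs: array of word arrays; fits(page): true if the page is not too tall.
function paginate(paragraphs, fits) {
  const pages = [];
  const queue = paragraphs.map((p) => p.slice()); // copy, do not mutate input
  while (queue.length > 0) {
    const page = [queue.shift()];
    // Add paragraphs until the page overflows or the input runs out.
    while (fits(page) && queue.length > 0) {
      page.push(queue.shift());
    }
    if (!fits(page)) {
      if (page.length > 1) {
        // Several elements: move the last one to the next page.
        queue.unshift(page.pop());
      } else {
        // A single oversized element X: move words from its end into a
        // new element Y at the start of the next page until X fits.
        const x = page[0];
        const y = [];
        while (x.length > 1 && !fits(page)) {
          y.unshift(x.pop());
        }
        if (y.length > 0) queue.unshift(y);
      }
    }
    pages.push(page);
  }
  return pages;
}

// Toy measurement for illustration: a page "fits" if it holds at most 4 words.
const fitsFourWords = (page) => page.reduce((n, p) => n + p.length, 0) <= 4;
const demo = paginate(
  [["a", "b"], ["c", "d", "e"], ["f", "g", "h", "i", "j", "k"]],
  fitsFourWords
);
console.log(JSON.stringify(demo));
```

As written, this follows the rule of moving a whole element when a page holds more than one, so it can leave empty space at the bottom of a page. To make the first slice fill the remaining space, as the question asks, the last element could be split with the same word-moving loop instead of being moved whole.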

Ali Gangji answered Oct 10 '22 23:10
