Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Adding N line breaks in a paragraph for the narrowest result

Let's say we have a paragraph such as this one:

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Assuming a fixed width font, we want to add exactly N line breaks (by replacing space characters only) to produce a N+1 line block of text.

Here's an example of output for N=8, we get a max line width of 51:

Lorem ipsum dolor sit amet, consectetur adipiscing 
elit, sed do eiusmod tempor incididunt ut labore 
et dolore magna aliqua. Ut enim ad minim veniam, 
quis nostrud exercitation ullamco laboris nisi ut 
aliquip ex ea commodo consequat. Duis aute irure 
dolor in reprehenderit in voluptate velit esse 
cillum dolore eu fugiat nulla pariatur. Excepteur 
sint occaecat cupidatat non proident, sunt in culpa 
qui officia deserunt mollit anim id est laborum.

How do we find which space characters to replace with line breaks to obtain the narrowest (least number of characters on the longest line) with the fewest attempts?

like image 604
Michel Rouzic Avatar asked Jun 02 '16 15:06

Michel Rouzic


People also ask

What is \n line break?

The line end (EOL), line break, line feed, line break or carriage return character is a control character, which is the end of a text line, and the following character should start in a different line. It's \r\n on the Windows system, and it's Linux on \n.

What is the purpose of adding a line break in the paragraph?

Line breaks serve an important function in setting the rhythm of a poem, since they insert a pause between the final word of one line and the first word of the next line.

How do you insert a line break in a paragraph?

Line Breaks - Hold Shift and Press Enter.

What happens when you insert a line break?

To start a new line within a paragraph, you can insert a line break in Word. This lets you jump to the next line, without having to change the set paragraph formatting or starting a new bullet point.


1 Answers

Suppose the text consists of m words, which we will number from 1 to m. Define f(i, j) to be the maximum width (in characters) of any line in an optimal (minimal-width) solution to the subproblem consisting of just the first i words, under the restriction that exactly j line breaks are used. Then the width of the best possible sequence of breaks to the entire problem will be given by f(m, n). This can be solved fairly efficiently using dynamic programming.

Let the total length in characters of the fragment between word i and word j >= i, inclusive, be len(i, j). (This is easy to calculate: Just compute an array of m+1 values len0[j], for 0 <= j <= m, each giving the total length in characters of the fragment consisting of the first j words; then len(i, j) is just len0[j] - len0[i-1] - 1, with the convention that len0[0] = -1.)

The base case is easy:

f(i, 0) = len(0, i)   (i.e., if there are no line breaks)

The recursive case is:

f(i, j) = the minimum over all 0 <= k < i of max(f(k, j-1), len(k+1, i))

That is, to find the best way to break the first i words into j+1 lines (i.e. using j line breaks), we can try the following for every shorter k-word prefix: determine the best way to break that k-word prefix into j lines (i.e. using j-1 line breaks), and compare the maximum width we get from that with the width that results from putting the rest of the i-k words on a single line at the end. Each prefix gives us a different candidate solution, so we can pick the best of all of them.

Constructing a solution

Now that we can calculate the optimal width f(m, n), how can we use this to actually construct a solution? Fortunately there is a standard technique in dynamic programming for this. The fastest way is to record, during calculation of f(i, j), the (actually a, since there may in general be multiple optimal solutions) value of k that produced the minimum in a predecessor table pred[i][j]. Having calculated f(m, n) and filled in the predecessor table, we can then construct an optimal solution by walking backwards through it: pred[i][j] tells us a value k such that we can produce an optimal solution by adding a line break after word k, so add a line break there, and then look at pred[k][j-1] to find the position of the previous line break, continuing on until j reaches 0.

Time and space complexity

If the recursion is memoised using dynamic programming, then there are at most O(mn) different parameter combinations that f() could be called with (i ranges between 0 and m, and j ranges between 0 and n), and the time spent outside of recursive calls is O(m) (k can range between 0 and m, and the computation for each value of k is O(1)), so the time complexity of this solution is O(nm^2). The space complexity is O(mn), but by switching to a bottom-up DP this can easily be reduced to O(m), since while computing f(i, j) we only ever need access to values of f() for which the second parameter is j-1, meaning it suffices to actually store just the size-(m+1) array of computed values f(q, j-1) for 0 <= q <= m.

like image 182
j_random_hacker Avatar answered Oct 31 '22 09:10

j_random_hacker