
Splitting a string with no line breaks into a list of lines with a maximum column count

I have a long string (multiple paragraphs) which I need to split into a list of line strings. The determination of what makes a "line" is based on:

  • The number of characters in the line is less than or equal to X (where X is a fixed number of columns per line)
  • OR, there is a newline in the original string (which forces a new "line" to be created).

I know I can do this algorithmically, but I was wondering if Python has something that can handle this case. It's essentially word-wrapping a string.

And, by the way, the output lines must be broken on word boundaries, not character boundaries.

Here's an example of input and output:

Input:

"Within eight hours of Wilson's outburst, his Democratic opponent, former-Marine Rob Miller, had received nearly 3,000 individual contributions raising approximately $100,000, the Democratic Congressional Campaign Committee said.

Wilson, a conservative Republican who promotes a strong national defense and reining in the size of government, won a special election to the House in 2001, succeeding the late Rep. Floyd Spence, R-S.C. Wilson had worked on Spence's staff on Capitol Hill and also had served as an intern for Sen. Strom Thurmond, R-S.C."

Output:

"Within eight hours of Wilson's outburst, his"
"Democratic opponent, former-Marine Rob Miller,"
" had received nearly 3,000 individual "
"contributions raising approximately $100,000,"
" the Democratic Congressional Campaign Committee"
" said."
""
"Wilson, a conservative Republican who promotes a "
"strong national defense and reining in the size "
"of government, won a special election to the House"
" in 2001, succeeding the late Rep. Floyd Spence, "
"R-S.C. Wilson had worked on Spence's staff on "
"Capitol Hill and also had served as an intern"
" for Sen. Strom Thurmond, R-S.C."
Karim asked Sep 10 '09 16:09



2 Answers


What you are looking for is the textwrap module, but on its own it is only part of the solution. wrap() treats newlines like any other whitespace, so to preserve the line breaks in the original string you need to wrap each line separately:

from textwrap import wrap

# text holds the input string; wrap each original line separately
# so that the existing newlines are kept
wrapped = '\n'.join('\n'.join(wrap(block, width=50)) for block in text.splitlines())

>>> print(wrapped)

Within eight hours of Wilson's outburst, his
Democratic opponent, former-Marine Rob Miller, had
received nearly 3,000 individual contributions
raising approximately $100,000, the Democratic
Congressional Campaign Committee said.

Wilson, a conservative Republican who promotes a
strong national defense and reining in the size of
government, won a special election to the House in
2001, succeeding the late Rep. Floyd Spence,
R-S.C. Wilson had worked on Spence's staff on
Capitol Hill and also had served as an intern for
Sen. Strom Thurmond, R-S.C.
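
Since the question asks for a list of lines rather than a single string, the same idea can be written as a small helper. This is only a minimal sketch, not part of the original answer: the name wrap_paragraphs and the sample text are hypothetical, and it assumes that blank lines in the input should survive as empty strings in the result.

```python
from textwrap import wrap


def wrap_paragraphs(text, width=50):
    """Return a list of lines of at most `width` characters, keeping original newlines.

    Hypothetical helper for illustration; only textwrap.wrap is a library call.
    """
    lines = []
    for block in text.splitlines():
        # wrap('') returns [], so fall back to [''] to keep blank lines
        lines.extend(wrap(block, width=width) or [''])
    return lines


# Made-up sample input with a blank line between two paragraphs
sample = "The quick brown fox jumps over the lazy dog.\n\nA second paragraph follows here."
for line in wrap_paragraphs(sample, width=20):
    print(repr(line))
```

Using extend() with the [''] fallback keeps the result a flat list, which is closer to the output format shown in the question than a joined string.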
Nadia Alramli answered Sep 22 '22 11:09



You probably want to use the textwrap module in the standard library:

http://docs.python.org/library/textwrap.html
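
As a rough illustration of what the module offers (the sample string and width below are made up for this sketch): textwrap.wrap returns a list of lines, while textwrap.fill returns the same lines joined into one string. By default both prefer to break at whitespace (and after hyphens in compound words), and an embedded newline is treated as ordinary whitespace, so paragraph breaks have to be handled separately.

```python
import textwrap

# Made-up sample text containing an embedded newline
sample = "one two three\nfour five six"

lines = textwrap.wrap(sample, width=10)   # list of wrapped lines
joined = textwrap.fill(sample, width=10)  # the same lines joined with '\n'

print(lines)
print(joined == '\n'.join(lines))
```

Here "three" and "four" end up on the same output line even though the input had a newline between them, which shows why the original newlines are lost unless each line is wrapped on its own.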

Paul McMillan answered Sep 20 '22 11:09
