
Reduce multiple blank lines to single (Pythonically)

How can I reduce multiple blank lines in a text file to a single line at each occurrence?

I have read the entire file into a string, because I want to do some replacement across line endings.

with open(sourceFileName, 'rt') as sourceFile:
    sourceFileContents = sourceFile.read()

This doesn't seem to work

while '\n\n\n' in sourceFileContents:
    sourceFileContents = sourceFileContents.replace('\n\n\n', '\n\n')

and nor does this

sourceFileContents = re.sub('\n\n\n+', '\n\n', sourceFileContents)

It's easy enough to strip them all, but I want to reduce multiple blank lines to a single one, each time I encounter them.

I feel that I'm close, but just can't get it to work.

asked Mar 06 '15 at 14:03 by Mawg says reinstate Monica


1 Answer

This is a reach, but perhaps some of the lines aren't completely blank (i.e. they have only whitespace characters that give the appearance of blankness). You could try removing all possible whitespace between newlines.

sourceFileContents = re.sub(r'(\n\s*)+\n+', '\n\n', sourceFileContents)
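
To check that hypothesis, here is a small sketch. The sample string is made up for illustration: its "blank" lines contain spaces and a tab. It compares the pattern from the question with the whitespace-tolerant pattern above.

```python
import re

# Hypothetical sample: the lines between 'a' and 'b' look blank,
# but two of them contain whitespace (two spaces, then a tab).
sample = 'a\n  \n\t\n\nb\n'

# The question's pattern needs three or more consecutive bare newlines,
# so whitespace-only lines break up the run and nothing is replaced.
strict = re.sub('\n\n\n+', '\n\n', sample)

# Allowing whitespace between the newlines collapses the whole run.
lenient = re.sub(r'(\n\s*)+\n+', '\n\n', sample)

print(repr(strict))
print(repr(lenient))
```

Printing with repr() is also a quick way to inspect the real file contents and see whether stray spaces or tabs are hiding on the blank lines.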

Edit: I realized the second '+' is superfluous, since \s* already matches any newlines between the first and the last. We just need the match to end on a newline so that we don't strip leading whitespace from a line that has other content.

sourceFileContents = re.sub(r'(\n\s*)+\n', '\n\n', sourceFileContents)

Edit 2

sourceFileContents = re.sub(r'\n\s*\n', '\n\n', sourceFileContents)

This should be an even simpler solution. We really just want to catch any whitespace (which includes intermediate newlines) between the two anchor newlines and collapse the whole run down to just those two newlines, which leaves a single blank line.
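
Wrapped up as a function, the final pattern might look like the sketch below. The name collapse_blank_lines is hypothetical (it is not from the original post), and the sketch assumes the whole file has already been read into one string, as in the question.

```python
import re

# Hypothetical helper name; the pattern is the one from "Edit 2".
def collapse_blank_lines(text):
    """Collapse each run of blank or whitespace-only lines to one blank line."""
    # \s* matches spaces, tabs and intermediate newlines between the two
    # anchor newlines; the match must end on a newline, so indentation on
    # the following content line is left alone.
    return re.sub(r'\n\s*\n', '\n\n', text)


print(repr(collapse_blank_lines('a\n\n\n\nb')))     # run of empty lines
print(repr(collapse_blank_lines('a\n \t\n\n  b')))  # whitespace-only line, indented 'b'
print(repr(collapse_blank_lines('a\nb')))           # nothing to collapse
```

With the question's variable, this would be called as sourceFileContents = collapse_blank_lines(sourceFileContents) right after the file is read.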

answered Sep 27 '22 at 21:09 by Marc Chiesa