How can I reduce multiple blank lines in a text file to a single line at each occurrence?
I have read the entire file into a string, because I want to do some replacement across line endings.
with open(sourceFileName, 'rt') as sourceFile:
    sourceFileContents = sourceFile.read()
This doesn't seem to work
while '\n\n\n' in sourceFileContents:
    sourceFileContents = sourceFileContents.replace('\n\n\n', '\n\n')
and nor does this
sourceFileContents = re.sub('\n\n\n+', '\n\n', sourceFileContents)
It's easy enough to strip them all, but I want to reduce multiple blank lines to a single one, each time I encounter them.
I feel that I'm close, but just can't get it to work.
Method 2: Use the strip() function to remove a newline character from a string in Python. The built-in strip() method removes all leading and trailing whitespace, including newline characters, from a string. The task can be performed with strip(), checking for "\n" in the string.
^\s+$ will remove everything from the first blank line to the last (in a contiguous block of empty lines), including lines that contain only tabs or spaces. [\r\n]* will then remove the last CRLF (or just LF, which is important because the .NET regex engine matches the $ between a \r and a \n, funnily enough).
strip() != "" to remove any empty lines from lines. Declare an empty string and use a for-loop to iterate over the previous result.
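A rough sketch of that line-by-line approach is below. The function name drop_blank_lines and the sample text are made up for illustration. Note that this strips every blank line, which is the easy case mentioned in the question, not the collapse to a single blank line.

```python
def drop_blank_lines(text):
    # Hypothetical helper: removes ALL blank or whitespace-only lines.
    lines = text.split("\n")

    # Keep only lines that still have content after stripping whitespace.
    non_empty = [line for line in lines if line.strip() != ""]

    # Declare an empty string and rebuild the text with a for-loop.
    result = ""
    for line in non_empty:
        result += line + "\n"
    return result


print(repr(drop_blank_lines("a\n\n  \nb\n")))
```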
This is a reach, but perhaps some of the lines aren't completely blank (i.e. they have only whitespace characters that give the appearance of blankness). You could try removing all possible whitespace between newlines.
re.sub(r'(\n\s*)+\n+', '\n\n', sourceFileContents)
Edit: realized the second '+' was superfluous, as the \s* will catch newlines between the first and last. We just want to make sure the last character is definitely a newline so we don't remove leading whitespace from a line with other content.
re.sub(r'(\n\s*)+\n', '\n\n', sourceFileContents)
Edit 2
re.sub(r'\n\s*\n', '\n\n', sourceFileContents)
This should be an even simpler solution. We really just want to catch any whitespace (which includes intermediate newlines) between the two anchor newlines that make up the single blank line, and collapse it down to just those two newlines.
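Here is a small self-contained sketch of that last pattern in use. The function name collapse_blank_lines and the sample string are made up for illustration, and the "\r\n" normalization step is an extra assumption on my part (it only matters if the text was not read in text mode with default newline handling).

```python
import re


def collapse_blank_lines(text):
    # Assumption: normalize Windows line endings first so the result
    # contains only "\n" line breaks.
    text = text.replace("\r\n", "\n")

    # Any run of whitespace between two newlines (including further
    # newlines and whitespace-only lines) becomes exactly one blank line.
    # The match must end on a newline, so indentation on the next
    # non-blank line is left alone.
    return re.sub(r"\n\s*\n", "\n\n", text)


sample = "first\n\n\n\nsecond\n  \t\n\n    indented third\nfourth\n"
print(repr(collapse_blank_lines(sample)))
# 'first\n\nsecond\n\n    indented third\nfourth\n'
```

If the original loop over '\n\n\n' appears to do nothing, printing repr(sourceFileContents) is a quick way to see whether the "blank" lines actually contain spaces, tabs, or '\r' characters.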