I have some text with hard line breaks in it like this:
This should all be on one line
since it's one sentence.
This is a new paragraph that
should be separate.
I want to remove the single newlines but keep the double newlines so it looks like this:
This should all be on one line since it's one sentence.
This is a new paragraph that should be separate.
Is there a single regular expression to do this? (or some easy way)
So far this is my only solution which works but feels hackish.
txt = txt.gsub(/(\r\n|\n|\r)/,'[[[NEWLINE]]]')
txt = txt.gsub('[[[NEWLINE]]][[[NEWLINE]]]', "\n\n")
txt = txt.gsub('[[[NEWLINE]]]', " ")
Replace all newlines that are not followed by or preceded by a newline:
text = <<END
This should all be on one line
since it's one sentence.
This is a new paragraph that
should be separate.
END
p text.gsub /(?<!\n)\n(?!\n)/, ' '
#=> "This should all be on one line since it's one sentence.\n\nThis is a new paragraph that should be separate. "
Or, for Ruby 1.8 without lookarounds:
txt.gsub! /([^\n])\n([^\n])/, '\1 \2'
text.gsub!(/(\S)[^\S\n]*\n[^\S\n]*(\S)/, '\1 \2')
The two (\S)
groups serve the same purposes as the lookarounds ((?<!\s)(?<!^)
and(?!\s)(?!$)
) in @sln's regexes:
[^\S\n]*\n[^\S\n]*
part consumes any other whitespace surrounding the linefeed, making it possible for us to normalize it to a single space.They also make the regex easier to read, and (perhaps most importantly) they work in pre-1.9 versions of Ruby that don't support lookbehinds.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With