Remove single line breaks, keep "empty" lines

Say I have text like the following selected with the cursor:

This is a test. 
This 
is a test.

This is a test. 
This is a 
test.

I would like to transform it into:

This is a test. This is a test.

This is a test. This is a test.

In other words, I would like to replace single line breaks by spaces, leaving empty lines alone.

I thought something like the following would work:

RemoveSingleLineBreaks()
{
  ClipSaved := ClipboardAll
  Clipboard =
  send ^c
  Clipboard := RegExReplace(Clipboard, "([^(\R)])(\R)([^(\R)])", "$1$3")    
  send ^v
  Clipboard := ClipSaved
  ClipSaved = 
}

But it doesn't. If I apply it to the text above, it yields:

This is a test. This is a test.
This is a test. This is a test.

which also removes the "empty line" in the middle. This is not what I want.

To clarify: by an empty line I mean any line containing only whitespace characters (e.g. tabs or spaces).

Any thoughts how to do this?

Amelio Vazquez-Reina asked May 05 '12 18:05


2 Answers

RegExReplace(Clipboard, "([^\r\n])\R(?=[^\r\n])", "$1")

This strips single line breaks whether the newline sequence is CR, LF or CR+LF. It does not treat whitespace-only lines as empty.

Your main problem was the use of \R:

\R inside a character class is merely the letter "R" [source]

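As a result, [^(\R)] matches any character other than "(", "R" or ")", line-break characters included. The solution is to use the CR and LF characters directly.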
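
The difference can be checked in isolation. The following is a minimal sketch, assuming AutoHotkey v1 syntax; the one-character test strings are made up for illustration and the comments state the results I would expect, not captured output.

```autohotkey
; AutoHotkey v1 sketch. RegExMatch returns the position of the first match, or 0 if there is none.
; Inside a character class, \R is just the letter "R", so [^\R] means "any character except R".
MsgBox % RegExMatch("R", "[^\R]")     ; expected 0: "R" is excluded by the class
MsgBox % RegExMatch("`n", "[^\R]")    ; expected 1: a line feed is not "R", so it matches
MsgBox % RegExMatch("`n", "[^\r\n]")  ; expected 0: the line feed is excluded explicitly
```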


To clarify: by an empty line I mean any line containing only whitespace characters (e.g. tabs or spaces).

RegExReplace(Clipboard, "(\S.*?)\R(?=.*?\S)", "$1")

This is the same as the one above, but treats whitespace-only lines as empty. It works because .*? lazily matches any characters other than line breaks (by default, . does not match line breaks), so the pattern requires a non-whitespace character on the same line both before and after the line break.

A lookahead is used to avoid 'eating' (matching) the next character, which can break on single-character lines. Note that since it is not matched, it is not replaced and we can leave it out of the replacement string. A lookbehind cannot be used because PCRE does not support variable-length lookbehinds, so a normal capture group and backreference are used there instead.
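
To see why the lookahead matters on single-character lines, here is a small comparison sketch, assuming AutoHotkey v1 and CR+LF line endings. The sample text and the variable names (text, consuming, lookahead) are made up for illustration. The first pattern consumes the character after the line break, so that character can no longer start the next match.

```autohotkey
; AutoHotkey v1 sketch with three single-character lines (made-up sample).
text := "a`r`nb`r`nc"

; Consuming version: "b" is swallowed by the first match, so the break after "b" is never seen.
consuming := RegExReplace(text, "(\S.*?)\R(.*?\S)", "$1$2")

; Lookahead version: "b" is only inspected, so it is still available to start the next match.
lookahead := RegExReplace(text, "(\S.*?)\R(?=.*?\S)", "$1")

MsgBox % consuming  ; expected: "ab", a line break, then "c"
MsgBox % lookahead  ; expected: "abc"
```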


I would like to replace single line breaks by spaces, leaving empty lines alone.

If you want to replace the line break with spaces, this is more appropriate:

RegExReplace(Clipboard, "(\S.*?)\R(?=.*?\S)", "$1 ")

This will replace single line breaks with a space.
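
Put together with the function from the question, this might look like the sketch below. It assumes AutoHotkey v1. The ClipWait call, the 1-second timeout, the 100 ms pause and the Ctrl+Alt+J hotkey are my own additions for illustration, not part of the question.

```autohotkey
; AutoHotkey v1 sketch; the timeout, the pause and the hotkey are arbitrary choices.
RemoveSingleLineBreaks()
{
    ClipSaved := ClipboardAll   ; back up the current clipboard contents
    Clipboard := ""             ; empty it so ClipWait can detect the copy
    Send ^c
    ClipWait, 1                 ; wait up to 1 second for the selection to arrive
    if (!ErrorLevel)            ; ErrorLevel is 0 when ClipWait found data in time
    {
        ; Join lines, but only where both neighbouring lines contain non-whitespace.
        Clipboard := RegExReplace(Clipboard, "(\S.*?)\R(?=.*?\S)", "$1 ")
        Send ^v
        Sleep, 100              ; give the target application time to paste
    }
    Clipboard := ClipSaved      ; restore the original clipboard
    ClipSaved := ""             ; free the backup
}

^!j::RemoveSingleLineBreaks()   ; hypothetical hotkey: Ctrl+Alt+J
```

Waiting with ClipWait avoids running the replacement before the copied text is actually on the clipboard. Note that a line ending in a space will end up with two spaces at the join.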


And if you wanted to use lookbehinds and lookaheads:


Strip single line breaks:

RegExReplace(Clipboard, "(?<=[^\r\n\t ][^\r\n])\R(?=[^\r\n][^\r\n\t ])", "")


Replace single line breaks with spaces:

RegExReplace(Clipboard, "(?<=[^\r\n\t ][^\r\n])\R(?=[^\r\n][^\r\n\t ])", " ")

For some reason, \S doesn't seem to work in lookbehinds and lookaheads. At least, not with my testing.

Bob answered Oct 03 '22 17:10



Clipboard := RegExReplace(Clipboard, "(\S+)\R", "$1 ")
mihai answered Oct 03 '22 16:10
