
Perl: regular expression: capturing group

Tags: regex, perl

In a code file, I want to remove any run of one or more consecutive blank lines (lines containing only zero or more spaces/tabs followed by a newline) that sits between the last line of code in a block and the block's closing }. The closing } may be preceded by spaces for indentation, and I want to keep those.

Here is what I tried:

perl -i -0777 -pe 's/\s+\n([ ]*)\}/\n($1)\}/g' file

For example, if my code file looks like (□ is the space character):

□□□□while (true) {\n
□□□□□□□□print("Yay!");□□□□□□\n
□□□□□□□□□□□□□□□□\n
□□□□}\n

Then I want it to become:

□□□□while (true) {\n
□□□□□□□□print("Yay!");\n
□□□□}\n

However, it does not make the change I expected. Any idea what I am doing wrong?

rapt asked Mar 06 '26 20:03


2 Answers

The only issues I can see with your regex are

  • you don't need the parentheses around the capture variable: in the replacement they are literal text, so ($1) writes the indentation wrapped in ( and ), and
  • the character class around the single space is redundant (unless you want to match tabs as well as spaces).

So, you could try

s/\s+\n( *)\}/\n$1\}/g

instead.

This works as expected when run on your test input.

To tidy it up even more, you could try the following.

s/\s+(\n *\})/$1/g

If there might be tabs as well as spaces, you can use a character class. (You do not need to include '|' inside the character class: there it is a literal pipe character, not alternation.)

s/\s+(\n[ \t]*\})/$1/g
David Collins answered Mar 08 '26 19:03


perl -pi -0777 -e's/^\s*\n(?=\s*})//mg' yourfile

(Remove whitespace from the beginning of a line through a newline that precedes a line with } as the first non-whitespace.)

ysth answered Mar 08 '26 21:03