
RegEx, Substituting a variable number of replacements

Tags: regex, perl

Hopefully I'm missing something obvious.

I've got a file that contains some lines like:

| A | B | C |
|-----------|
Ignore this line
| And | Ignore | This |
| D | E | F | G |
|---------------|

I want to find the |----| lines, remove them, and replace every | character with a ^ in the line that precedes each one, e.g.

^ A ^ B ^ C ^
Ignore this line
| And | Ignore | This |
^ D ^ E ^ F ^ G ^

So far I've got:

perl -0pe 's/^(\|.*\|)\n\|-+\|/$1/mg'

This takes input from stdin (some other modifications have already happened with sed)... and it's using -0 and /m to support multiline replacements.

The match seems to be correct, and it removes the |----| lines, but I can't see how I can do the | to ^ substitution with the $1 (or \1) backreference.

I can't remember where I did it before, but another language allowed me to use ${1/A/B} to substitute A to B, but that's upsetting perl.

And I've been wondering if this is where /e or /ee could be used, but I'm not familiar enough with perl to know how to do that.

Craig Francis asked May 28 '21 18:05


1 Answer

You can use the following, where t is the input file (drop it to read from stdin, as in your pipeline):

perl -0pe 's{^(.*)\R\|-+\|$}{$1 =~ s,\|,^,gr}gme' t

Details:

  • ^(.*)\R\|-+\|$ - matches all occurrences (see the g flag at the end)
    • ^ - start of a line (note the m flag that makes ^ match start of a line and $ match end of a line)
    • (.*) - Group 1: the whole preceding line
    • \R - a line break sequence
    • \| - a | char
    • -+ - one or more - chars
    • \| - a | char
    • $ - end of line

The line break after the |----| line is deliberately left unmatched, so it stays in the output and terminates the rewritten line.

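Once a match is found, every | in the Group 1 value is replaced with ^ by $1 =~ s,\|,^,gr. The e flag on the outer substitution makes Perl evaluate the replacement as code instead of treating it as a string, and the r flag on the inner substitution (available since Perl 5.14) makes it return the modified copy rather than trying to change the read-only $1.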

Wiktor Stribiżew answered Oct 27 '22 00:10