
Replace specific capture group instead of entire regex in Perl

I've got a regular expression with capture groups that matches what I want in a broader context. I then take capture group $1 and use it for my needs. That's easy.

But how do I use capture groups with s/// when I only want to replace the content of $1, not the entire match, with my replacement?

For instance, if I do:

$str =~ s/prefix (something) suffix/42/

prefix and suffix are removed. Instead, I would like something to be replaced by 42, while keeping prefix and suffix intact.

flohei asked Aug 26 '12 14:08



3 Answers

If you only need to replace one capture then using @LAST_MATCH_START and @LAST_MATCH_END (with use English; see perldoc perlvar) together with substr might be a viable choice:

use English qw(-no_match_vars);
if ($your_string =~ m/aaa (bbb) ccc/) {
    # Offsets of capture group 1 within $your_string
    my $start  = $LAST_MATCH_START[1];
    my $length = $LAST_MATCH_END[1] - $start;
    substr($your_string, $start, $length, "new content");  # replaces "bbb" with "new content"
}
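
For reuse, the same idea can be wrapped in a small function. The following is only a sketch: `replace_group` is a hypothetical helper name, not part of Perl or any module, and it uses the built-in arrays `@-` and `@+`, which are the short names for `@LAST_MATCH_START` and `@LAST_MATCH_END`.

```perl
use strict;
use warnings;

# Hypothetical helper: replace capture group $group of the first match of
# $regex in $string with $replacement, leaving the rest of the match intact.
sub replace_group {
    my ($string, $regex, $group, $replacement) = @_;
    if ($string =~ $regex and defined $-[$group]) {
        # $-[$n] and $+[$n] are the start and end offsets of group n.
        substr($string, $-[$group], $+[$group] - $-[$group], $replacement);
    }
    return $string;    # returned unchanged if there was no match
}

my $result = replace_group('prefix something suffix',
                           qr/prefix (something) suffix/, 1, '42');
print "$result\n";    # prints "prefix 42 suffix"
```

Returning a modified copy rather than editing the argument in place is a design choice of this sketch; the answer's own code modifies `$your_string` directly.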
Moritz Bunkus answered Oct 12 '22 23:10


As I understand it, you can use a look-ahead or look-behind, neither of which consumes characters. Alternatively, capture the text in groups and put back only the groups you want to keep. Examples:

With look-ahead:

s/your_text(?=ahead_text)//;

Grouping data:

s/(your_text)(ahead_text)/$2/;
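
Applied to the example from the question, these ideas might look like the sketch below. The variable names are made up for illustration. The third variant uses `\K`, which is not mentioned in the answer but is available since Perl 5.10.

```perl
use strict;
use warnings;

my $str = 'prefix something suffix';

# 1. Capture the context and put it back. ${1} needs braces here,
#    because "$142" would be read as capture group 142.
(my $grouped = $str) =~ s/(prefix )something( suffix)/${1}42$2/;

# 2. Look-behind and look-ahead match the context without consuming it.
(my $lookaround = $str) =~ s/(?<=prefix )something(?= suffix)/42/;

# 3. \K (Perl 5.10+) keeps everything matched before it out of the replaced text.
(my $kept = $str) =~ s/prefix \Ksomething(?= suffix)/42/;

print "$_\n" for $grouped, $lookaround, $kept;    # each line: prefix 42 suffix
```

Note that the look-behind variant only works when the prefix has a fixed length; for a variable-length prefix, the grouping or `\K` form is the safer choice.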
Birei answered Oct 12 '22 21:10


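This is an old question, but I found the following easier for renaming lines that start with >something to >something_else. It is handy for changing the headers of FASTA sequences:

  # Append "_else" to every header line that does not already contain "else".
  # A single s///mg pass avoids modifying $filelines inside a while (//g) loop,
  # which resets the match position, and avoids using $1 unescaped as a pattern.
  $filelines =~ s/^>(?!.*else)(.*)$/>${1}_else/mgi;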
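
If the file is too large to hold in one string, the same renaming can be done line by line. This is a rough sketch under assumptions: the sample data is made up, and an in-memory filehandle stands in for a real FASTA file.

```perl
use strict;
use warnings;

# Made-up sample data; a real script would open a file instead.
my $fasta = ">seq1\nACGT\n>seq2_else\nTTGA\n>seq3\nGGCC\n";

open my $in, '<', \$fasta or die "Cannot open in-memory handle: $!";
my $out = '';
while (my $line = <$in>) {
    # Only header lines that do not already contain "else" are changed.
    if ($line =~ /^>/ and $line !~ /else/i) {
        # Keep trailing whitespace (the newline) after the new suffix.
        $line =~ s/^(>.*?)(\s*)$/${1}_else$2/;
    }
    $out .= $line;
}
close $in;

print $out;
```

With the sample data above, `seq1` and `seq3` gain the `_else` suffix, while `seq2_else` and the sequence lines are left alone.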
Jabda answered Oct 12 '22 21:10