
Perl: regex substitution works only once

Tags:

regex

perl

I need to process a string with a regex and change x to y wherever there is a digit on both sides of it.

String: 2x2x2 2x 2x2x 2x2x2x2x2

Regexp: s/([0-9])x([0-9])/$1y$2/g

my $string = "2x2x2 2x 2x2x 2x2x2x2x2";

$string =~ s/([0-9])x([0-9])/$1y$2/g;

print "$string\n";

I expect: 2y2y2 2x 2y2x 2y2y2y2y2

But the result is: 2y2x2 2x 2y2x 2y2x2y2x2 (not every 2x2 was changed)

What should I do?

taofos asked Oct 06 '12 07:10



2 Answers

Try the regex below:

s/(?<=\d)x(?=\d)/y/g
xdazz answered Sep 22 '22 07:09


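To be explicit: the reason "2x2x2" turns into "2y2x2" is that your expression first matches "2x2", replaces it with "2y2", then resumes searching after the end of that match. The middle "2" has already been consumed as part of the first match, so the next bit is "x2", which doesn't match your pattern because there is no digit left in front of the "x".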
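To see the resumption point for yourself, here is a small sketch (the variable names are mine, not from the question) that runs the original pattern in a loop and reports where each match starts and where the next search will begin:

```perl
use strict;
use warnings;

my $string = "2x2x2";
my @starts;    # offsets at which the original pattern matched

# In scalar context, /g matches once per iteration and remembers
# where it stopped; pos() reports that position.
while ($string =~ /([0-9])x([0-9])/g) {
    my $matched = substr($string, $-[0], $+[0] - $-[0]);
    push @starts, $-[0];
    printf "matched '%s' at offset %d, next search starts at offset %d\n",
        $matched, $-[0], pos($string);
}

printf "total matches: %d\n", scalar @starts;
```

Only one match is reported for "2x2x2": it covers offsets 0 to 2, and the search resumes at offset 3, which is the second "x" with no unconsumed digit before it.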

The reason @xdazz's solution works is that look-around assertions don't actually consume characters of the string. The only text actually matched is "x", and only where it is preceded and followed by a digit, so each digit is still available as context for the next match.
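A side-by-side sketch of the two behaviours on the string from the question (variable names are my own). The third variant, which keeps the first capture and only turns the trailing digit into a lookahead, is an alternative I am adding for comparison; it is not from either answer:

```perl
use strict;
use warnings;

my $input = "2x2x2 2x 2x2x 2x2x2x2x2";

# Original pattern: both digits are consumed by each match.
(my $consuming = $input) =~ s/([0-9])x([0-9])/$1y$2/g;

# Lookbehind + lookahead: only the "x" itself is consumed.
(my $lookaround = $input) =~ s/(?<=[0-9])x(?=[0-9])/y/g;

# Alternative (my assumption): consume the leading digit,
# but only peek at the trailing one.
(my $lookahead_only = $input) =~ s/([0-9])x(?=[0-9])/$1y/g;

print "consuming:      $consuming\n";
print "lookaround:     $lookaround\n";
print "lookahead only: $lookahead_only\n";
```

The first line reproduces the result from the question, while the other two both give 2y2y2 2x 2y2x 2y2y2y2y2.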

Incidentally, @xdazz's change from [0-9] to \d doesn't really change much, but it's slightly different: \d will also match other Unicode characters that are considered digits, whereas [0-9] only matches the exact 10 characters in the given range.
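To illustrate the difference, here is a sketch using U+0663 (ARABIC-INDIC DIGIT THREE) as an example of a non-ASCII digit; the choice of character and the variable names are mine. The /a modifier shown in the last test restricts \d to ASCII and, as far as I know, needs Perl 5.14 or later:

```perl
use strict;
use warnings;

# "x" surrounded by two Arabic-Indic digits (U+0663).
my $arabic = "\x{0663}x\x{0663}";

# \d follows Unicode rules for this string, so the digits count.
my $unicode_d   = ($arabic =~ /(?<=\d)x(?=\d)/)       ? 1 : 0;

# [0-9] matches only the ten ASCII digits.
my $ascii_class = ($arabic =~ /(?<=[0-9])x(?=[0-9])/) ? 1 : 0;

# /a makes \d behave like [0-9].
my $ascii_d     = ($arabic =~ /(?<=\d)x(?=\d)/a)      ? 1 : 0;

print "\\d matches:           $unicode_d\n";
print "[0-9] matches:        $ascii_class\n";
print "\\d with /a matches:   $ascii_d\n";
```

If your input is plain ASCII, the two forms behave the same; the distinction only matters when the data may contain digits from other scripts.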

Ken Williams answered Sep 22 '22 07:09