
How does \1 work in Perl regular expressions?

Tags: regex, perl

I'm trying to match a string of the form "something is something", where both words are the same.

$_ = "anna is ann";
if (/([a-zA-Z]+) is \1/) {
    print "matched\n";
}

I expected this not to match, but it does. Why?

user1289 asked Dec 01 '25 00:12



2 Answers

Try print $1; or print $&; and you will see that /([a-zA-Z]+) is \1/ matches the "a is a" substring of "anna is ann". To restrict the match, anchor the pattern to the beginning and/or end of the string (or line, under /m) with ^ and $ respectively, or use the word boundary \b if you want to match within a longer string. So:

/^([a-zA-Z]+) is \1$/ will match "anna is anna" but not "anna is ann" or "anna is anna ".

/\b([a-zA-Z]+) is \1\b/ will match "x anna is anna y" and "sue-ann is ann-marie" but not "anna is ann", "anna is anne", or "anna is annabelle".
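The suggestions above can be checked with a short script. This is only a sketch: the helper names show_match, matches_anchored and matches_bounded are made up for illustration, and the test strings are the ones quoted in this answer.

```perl
use strict;
use warnings;

# Report what the unanchored pattern from the question actually matched.
sub show_match {
    my ($str) = @_;
    if ($str =~ /([a-zA-Z]+) is \1/) {
        # $& is the whole matched substring, $1 the first capture group.
        my ($whole, $group) = ($&, $1);
        return ($whole, $group);
    }
    return;
}

my ($whole, $group) = show_match("anna is ann");
print "matched '$whole', \$1 = '$group'\n";   # matched 'a is a', $1 = 'a'

# The two restricted variants suggested above (helper names are hypothetical).
sub matches_anchored { return $_[0] =~ /^([a-zA-Z]+) is \1$/   ? 1 : 0 }
sub matches_bounded  { return $_[0] =~ /\b([a-zA-Z]+) is \1\b/ ? 1 : 0 }

for my $s ("anna is anna", "anna is ann", "x anna is anna y",
           "sue-ann is ann-marie", "anna is annabelle") {
    printf "%-24s anchored=%d bounded=%d\n",
        "'$s'", matches_anchored($s), matches_bounded($s);
}
```

Printing $& first is usually the quickest way to find out which part of the string a surprising match actually covered.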

haukex answered Dec 03 '25 15:12



It matches 6 chars starting at pos 3 (a is a). Perhaps you should have used

/^([a-zA-Z]+) is \1\z/
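Whether you end the pattern with $ or \z only matters when the string can end in a newline: $ also matches just before a final newline, while \z matches only at the very end of the string. A minimal sketch, assuming the input is a line that has not been chomp-ed (the variable names are illustrative):

```perl
use strict;
use warnings;

my $line = "anna is anna\n";   # e.g. a line read from a file, not yet chomp-ed

# $ also matches just before a trailing newline.
my $with_dollar = $line =~ /^([a-zA-Z]+) is \1$/  ? 1 : 0;
# \z matches only at the very end of the string.
my $with_z      = $line =~ /^([a-zA-Z]+) is \1\z/ ? 1 : 0;

print "with \$: $with_dollar, with \\z: $with_z\n";   # with $: 1, with \z: 0
```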

The original, unanchored pattern is matched as follows:

  1. Starting at pos 0.
    1. [a-zA-Z]+ matches 4 chars at pos 0.
      1. " is " matches 4 chars at pos 4.
      2. \1 doesn't match at pos 8: Backtrack.
    2. [a-zA-Z]+ matches 3 chars at pos 0.
      1. " is " doesn't match at pos 3: Backtrack.
    3. [a-zA-Z]+ matches 2 chars at pos 0.
      1. " is " doesn't match at pos 2: Backtrack.
    4. [a-zA-Z]+ matches 1 char at pos 0.
      1. " is " doesn't match at pos 1: Backtrack.
  2. Starting at pos 1.
    1. [a-zA-Z]+ matches 3 chars at pos 1.
      1. " is " matches 4 chars at pos 4.
      2. \1 doesn't match at pos 8: Backtrack.
    2. [a-zA-Z]+ matches 2 chars at pos 1.
      1. " is " doesn't match at pos 3: Backtrack.
    3. [a-zA-Z]+ matches 1 char at pos 1.
      1. " is " doesn't match at pos 2: Backtrack.
  3. Starting at pos 2.
    1. [a-zA-Z]+ matches 2 chars at pos 2.
      1. " is " matches 4 chars at pos 4.
      2. \1 doesn't match at pos 8: Backtrack.
    2. [a-zA-Z]+ matches 1 char at pos 2.
      1. " is " doesn't match at pos 3: Backtrack.
  4. Starting at pos 3.
    1. [a-zA-Z]+ matches 1 char at pos 3.
      1. " is " matches 4 chars at pos 4.
      2. \1 matches 1 char at pos 8.
      3. Match! (6 chars starting at pos 3)
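The walk-through above can be emulated by hand to confirm the result. The sub trace_match below is a hypothetical helper written only for this illustration; it mirrors the steps of the trace and is not how perl's regex engine is actually implemented (the real engine applies optimizations). The script then compares its answer with the offsets perl reports in @- and @+.

```perl
use strict;
use warnings;

# Hand-rolled emulation of /([a-zA-Z]+) is \1/ (illustrative only).
# Returns (start, length) of the first match, or an empty list.
sub trace_match {
    my ($str) = @_;
    for my $start (0 .. length($str) - 1) {
        # Greedy: take the longest run of letters at $start ...
        my ($run) = substr($str, $start) =~ /^([a-zA-Z]*)/;
        # ... then give back one character at a time (backtracking).
        for (my $len = length $run; $len >= 1; $len--) {
            my $cap = substr($str, $start, $len);   # what \1 would hold
            my $pos = $start + $len;
            next unless substr($str, $pos, 4) eq ' is ';
            $pos += 4;
            next unless substr($str, $pos, $len) eq $cap;
            return ($start, $pos + $len - $start);
        }
    }
    return;
}

my $str = "anna is ann";
my ($start, $len) = trace_match($str);
print "emulation: $len chars starting at pos $start\n";   # 6 chars starting at pos 3

if ($str =~ /([a-zA-Z]+) is \1/) {
    # $-[0] and $+[0] are the start and end offsets of the whole match.
    my ($from, $to) = ($-[0], $+[0]);
    my $real_len = $to - $from;
    print "regex:     $real_len chars starting at pos $from\n";   # 6 chars starting at pos 3
}
```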
ikegami answered Dec 03 '25 14:12



