 

Regular expression with back reference

Tags: regex, ruby

Can anybody explain exactly how back references work in Ruby regular expressions? In particular, I want to know how the (..) grouping works. For example:

s = /(..) [cs]\1/.match("The cat sat in the hat")

puts s 

For the code snippet above, the output is "at sat". Why and how does it produce this output?

K M Rakibul Islam asked Dec 01 '22 05:12


2 Answers

Here is what each part of this regular expression means:

regex = /(..) [cs]\1/
#        ├──┘ ├──┘├┘
#        │    │   └─ A back reference: matches the exact text captured by the first group.
#        │    └─ A "character class" matching either "c" or "s".
#        └─ A "capture group" matching any two characters (other than a newline); "\1" refers back to it.

The space between the group and the character class matches a literal space.
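
To see why the result is "at sat", it may help to compare the back reference with a pattern that simply repeats the two dots. This is only an illustrative sketch; the variable names (text, with_backref, without_backref) are not from the original post.

```ruby
text = "The cat sat in the hat"

# \1 must match the same two characters that (..) captured.
with_backref = /(..) [cs]\1/.match(text)

# Here the second ".." may match any two characters,
# so an earlier match is found.
without_backref = /(..) [cs]../.match(text)

puts with_backref[0]         # at sat
puts with_backref.begin(0)   # 5
puts without_backref[0]      # he cat
```

The engine tries each start position from left to right. At "he cat" the group captures "he", but the text after "c" is "at", so \1 fails and the engine moves on. At "at sat" the group captures "at" and the text after "s" is also "at", so the match succeeds.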

Note that after a successful match against a regular expression with capture groups, the special variables $1, $2, etc. contain the text captured by each group.

/(..) [cs]\1/.match('The cat sat in the hat') # => #<MatchData...>
$1 # => "at"

Note also that the Regexp#match method returns a MatchData object, which holds the text of the entire match ("at sat", also available as $&) followed by the text of each capture group ("at", also available as $1):

/(..) [cs]\1/.match('The cat sat in the hat')
# => #<MatchData "at sat" 1:"at">
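
The same values can also be read from the MatchData object itself, without relying on the special variables. A minimal sketch, assuming the match succeeds (match returns nil otherwise); the local variable names are made up for illustration:

```ruby
m = /(..) [cs]\1/.match("The cat sat in the hat")

entire = m[0]          # same text as $& right after the match
first  = m[1]          # same text as $1 right after the match
before = m.pre_match   # text preceding the match
after  = m.post_match  # text following the match

p entire  # "at sat"
p first   # "at"
p before  # "The c"
p after   # " in the hat"
p m.to_a  # ["at sat", "at"]
```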
maerics answered Dec 06 '22 09:12


First, puts s prints the entire matched text (puts calls to_s on the MatchData object), not the capture groups:

s = /(..) [cs]\1/.match("The cat sat in the hat")
puts s
# at sat

To access the capture groups, use MatchData#captures:

s = /(..) [cs]\1/.match("The cat sat in the hat")
s.captures
# => ["at"]
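
As a possible variation that is not part of the original question, Ruby also allows the group to be named and referenced with \k<name> instead of \1. The group name pair below is a hypothetical choice for illustration:

```ruby
# (?<pair>..) names the group; \k<pair> refers back to what it captured.
m = /(?<pair>..) [cs]\k<pair>/.match("The cat sat in the hat")

p m[0]        # "at sat"
p m[:pair]    # "at"
p m.captures  # ["at"]
p m.names     # ["pair"]
```

Named groups still appear in captures, so the call shown above keeps working.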
pje answered Dec 06 '22 08:12