Regular expression with back reference

Question

Can anybody explain exactly how back references work in Ruby regular expressions? I particularly want to know how the (..) grouping works. For example:

s = /(..) [cs]\1/.match("The cat sat in the hat")

puts s

For the code snippet above, the output is: at sat. Why, and how, does it produce this output?

maerics · Accepted Answer

Here is what this regular expression means:

regex = /(..) [cs]\1/
#        ├──┘ ├──┘├┘
#        │    │   └─ A reference to whatever was in the first matching group.
#        │    └─ A "character class" matching either "c" or "s".
#        └─ A "matching group" referenced by "\1" containing any two characters.
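To see why the result is "at sat", it may help to trace the match by hand. The sketch below assumes the same pattern and sentence as the question; the variable names (text, pattern, loose, no_match) are only illustrative. The engine tries each starting position from the left: "he" followed by " c" fails because the next two characters are "at", not "he", while "at" followed by " s" succeeds because "at" appears again.

```ruby
text    = "The cat sat in the hat"
pattern = /(..) [cs]\1/

m = pattern.match(text)
puts m[0]        # the whole match
puts m[1]        # what group 1 captured
puts m.begin(0)  # index in text where the match starts

# Same pattern with ".." instead of "\1": any two characters are accepted,
# so the match starts earlier in the string.
loose = /(..) [cs]../.match(text)
puts loose[0]

# The back reference must repeat the captured text exactly.
# "at" is followed by " sit", and "it" is not "at", so there is no match.
no_match = pattern.match("The cat sit")
p no_match
```

Comparing m[0] with loose[0] shows the effect of the back reference: \1 does not mean "the same kind of thing as group 1", it means "the exact text group 1 captured in this attempt".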

Note that after a successful match against a regular expression with matching groups, the special variables $1, $2, etc. contain the text each group matched.

/(..) [cs]\1/.match('The cat sat in the hat') # => #<MatchData...>
$1 # => "at"

Note also that the Regexp#match method returns a MatchData object, which holds the substring matched by the whole pattern ("at sat", also available as $&) followed by the text of each matching group ("at", also available as $1):

/(..) [cs]\1/.match('The cat sat in the hat')
# => #<MatchData "at sat" 1:"at">
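As a rough illustration of what the MatchData object exposes, the following sketch reads the same information through its methods rather than through the special variables. The local names (m, whole, first) are assumptions made for the example.

```ruby
m = /(..) [cs]\1/.match("The cat sat in the hat")

# $& and $1 are overwritten by the next match, so copy them right away
# if they are needed later.
whole = $&
first = $1

puts m[0]          # whole match, same text as $&
puts m[1]          # first group, same text as $1
puts m.pre_match   # text before the match
puts m.post_match  # text after the match
p m.to_a           # whole match followed by each group
```

Indexing the MatchData object is usually easier to follow than the globals, since the result stays attached to the variable that holds it.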

pje · Answer

First, puts s prints the entire match (the result of MatchData#to_s), not the capture groups:

s = /(..) [cs]\1/.match("The cat sat in the hat")
puts s
# at sat

If you want to access the capture groups, use MatchData#captures:

s = /(..) [cs]\1/.match("The cat sat in the hat")
s.captures
# => ["at"]
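With more than one group, captures returns one entry per group, in order. The sketch below is a variation on the question's pattern, not something from the original post: it splits (..) into two single-character groups so that \1 and \2 each refer to one character.

```ruby
# Hypothetical variant of the pattern: two groups, two back references.
s = /(.)(.) [cs]\1\2/.match("The cat sat in the hat")

puts s.to_s    # what "puts s" prints: the whole match
p s.captures   # one element per capture group
p s[2]         # the second group on its own
```

This matches the same text as the original pattern, because \1\2 together must repeat the two captured characters in the same order.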

Tags:

regex

ruby

K M Rakibul Islam
