Can anybody explain exactly how backreferences work in Ruby regular expressions? I particularly want to know how the (..) grouping works. For example:
s = /(..) [cs]\1/.match("The cat sat in the hat")
puts s
For the code snippet above, the output is: at sat. Why/how is it getting this output?
Here is what this regular expression means:
regex = /(..) [cs]\1/
#        ├──┘ ├──┘├┘
#        │    │   └─ A reference to whatever was in the first matching group.
#        │    └─ A "character class" matching either "c" or "s".
#        └─ A "matching group" referenced by "\1" containing any two characters.
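To see why the result is "at sat", it may help to check where the match starts and what happens when the text after [cs] does not repeat the group. This is a minimal sketch; the variable names and the extra test string "The cat cow" are made up for illustration and are not from the question.

```ruby
re   = /(..) [cs]\1/
text = "The cat sat in the hat"

m = re.match(text)

whole  = m[0]        # the entire matched substring
group  = m[1]        # what (..) captured, which \1 must repeat
offset = m.begin(0)  # index in text where the match starts

puts whole   # at sat
puts group   # at
puts offset  # 5

# Earlier starting points fail: "he" is followed by " c", but the
# next two characters are "at", not "he", so \1 does not match there.

# If the text after [cs] does not repeat the captured pair, nothing matches.
no_repeat = re.match("The cat cow")
p no_repeat  # nil
```

The engine tries each starting position from left to right, and index 5 ("at", a space, "s", then "at" again) is the first one where every part of the pattern, including the backreference, succeeds.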
Note that after matching a regular expression with a matching group, the special variables $1 ($2, etc.) will contain what matched.
/(..) [cs]\1/.match('The cat sat in the hat') # => #<MatchData...>
$1 # => "at"
Note also that the Regexp#match method returns a MatchData object, which contains the substring matched by the entire pattern ("at sat", aka $&) followed by the text captured by each matching group ("at", aka $1):
/(..) [cs]\1/.match('The cat sat in the hat')
# => #<MatchData "at sat" 1:"at">
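The parts of a MatchData object can also be read without the special variables. The following is a small sketch using MatchData#[], #pre_match and #post_match; the variable names are only illustrative.

```ruby
m = /(..) [cs]\1/.match("The cat sat in the hat")

whole  = m[0]          # entire match, the same text as $&
first  = m[1]          # first matching group, the same text as $1
before = m.pre_match   # text to the left of the match
after  = m.post_match  # text to the right of the match

p whole   # "at sat"
p first   # "at"
p before  # "The c"
p after   # " in the hat"
```

Indexing the MatchData directly avoids relying on $1 and friends, which are overwritten by the next match.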
Firstly, puts s prints the entire match, not the capture groups:
s = /(..) [cs]\1/.match("The cat sat in the hat")
puts s
# at sat
If you want to access the capture groups, use MatchData#captures:
s = /(..) [cs]\1/.match("The cat sat in the hat")
s.captures
# => ["at"]
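The difference between the whole match and the captures is easier to see with more than one group. The sketch below assumes a slightly modified pattern, with [cs] wrapped in its own group; that change is mine, for illustration only, and is not part of the original question.

```ruby
# Hypothetical variant of the pattern: ([cs]) is now group 2.
m = /(..) ([cs])\1/.match("The cat sat in the hat")

as_string = m.to_s      # what puts prints: the entire match
groups    = m.captures  # only the capture groups, in order
everything = m.to_a     # entire match followed by the groups

p as_string   # "at sat"
p groups      # ["at", "s"]
p everything  # ["at sat", "at", "s"]
```

puts calls to_s on its argument, and MatchData#to_s returns the entire matched string, which is why puts s shows "at sat" rather than "at".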