I'm going through about_regular_expressions.rb and don't understand exactly what's happening here:
def test_variables_can_also_be_used_to_access_captures
  assert_equal "Gray, James", "Name: Gray, James"[/(\w+), (\w+)/]
  assert_equal "Gray", $1
  assert_equal "James", $2
end
It seems to me like the use of the parentheses in the regular expression creates two new variables under the hood ($1 and $2).
Is this correct?
But then I did this:
def test_variables_can_also_be_used_to_access_captures
  assert_equal "Gray, James", "Name: Gray, James"[/(\w+), (\w+)/]
  assert_equal "Smith, Bobert", "Name: Smith, Bobert"[/(\w+), (\w+)/]
  assert_equal "Smith", $1
  assert_equal "Bobert", $2
end
And it captured "Smith" and "Bobert". I guess the previous values were just overwritten each time a new regex with parentheses is used?
If I then try to capture just one word:
def test_variables_can_also_be_used_to_access_captures
  assert_equal "Gray, James", "Name: Gray, James"[/(\w+), (\w+)/]
  assert_equal "Smith, Bobert", "Name: Smith, Bobert"[/(\w+), (\w+)/]
  assert_equal "Smith", $1
  assert_equal "Bobert", $2
  assert_equal "Susan,", "Name: Susan, whatever"[/(\w+),/]
  assert_equal "Susan", $1
  assert_equal nil, $2
end
$2 is gone... (no more "Bobert")
Can anyone shed some light on what happens under the hood? Or point me in the right direction?
You are right. Every time a regex is matched, the special variables $~, $&, ..., $1, $2, ... are overwritten. (They look like globals, but Ruby keeps them local to the current method and thread.) In your last example, the regex has nothing to capture for $2 because it has no second (...) group, so nil was assigned to $2.
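As a rough sketch of the overwriting described above, the snippet below records $1 and $2 after three consecutive matches inside one method. The method name capture_history and the third input string are made up for illustration; the third step assumes the usual behaviour that a failed match also resets the variables to nil.

```ruby
# Hypothetical helper: records [$1, $2] after each of three matches.
def capture_history
  history = []

  "Name: Gray, James"[/(\w+), (\w+)/]    # two groups: sets $1 and $2
  history << [$1, $2]

  "Name: Susan, whatever"[/(\w+),/]      # one group: $1 is replaced, $2 becomes nil
  history << [$1, $2]

  "no comma here"[/(\w+), (\w+)/]        # no match: $~ is nil, so $1 and $2 are nil
  history << [$1, $2]

  history
end

p capture_history
```

Each entry reflects only the most recent match, which is why "Bobert" disappears in the question's last test.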
When you want to use the results of several matches side by side, the technique I use is to keep the match data in variables. That is, immediately after the first regex match, assign match1 = $~. Then go on to the next regex match and do match2 = $~, and so on. Later, you can extract the matched results from these variables. For example, after doing several regex matches, if you want to refer back to the $1 that was assigned by the first match, you can get it with match1[1], etc.
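A minimal sketch of that technique, reusing the strings from the question. The method name saved_matches is hypothetical; match1 and match2 follow the naming in the answer.

```ruby
# Hypothetical helper: keeps the MatchData of each match in its own variable.
def saved_matches
  "Name: Gray, James"[/(\w+), (\w+)/]
  match1 = $~                          # MatchData for the first match

  "Name: Susan, whatever"[/(\w+),/]
  match2 = $~                          # MatchData for the second match

  # match1 still holds the first captures, while $1 now reflects the second match.
  [match1[1], match1[2], match2[1], $1]
end

p saved_matches
```

If you prefer not to touch $~ at all, String#match returns the same kind of MatchData object directly, for example match1 = "Name: Gray, James".match(/(\w+), (\w+)/).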