
Ruby Regex: Get Index of Capture

Tags: regex, ruby

I've seen this question asked and answered for JavaScript regex, and the answer was long and very ugly. I'm curious whether anyone has a cleaner way to implement it in Ruby.

Here's what I'm trying to achieve:

Test String: "foo bar baz"
Regex: /.*(foo).*(bar).*/
Expected Return: [[0,2],[4,6]]

So my goal is a method that takes the test string and the regex and returns the indices where each capture group matched. The expected return includes both the starting and the ending (inclusive) index of each capture group. I'll be working on this and adding my own potential solutions here along the way too. And of course, if a way other than regex would be cleaner or easier, that's a good answer too.

Jeff Escalante asked Dec 26 '22 00:12


1 Answer

Something like this should work for any number of capture groups.

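def match_indexes(string, regex)
  matches = string.match(regex)
  return [] if matches.nil? # the regex did not match at all

  # Group 0 is the whole match, so the capture groups start at 1.
  (1...matches.length).map do |index|
    # begin/end return nil for a group that did not take part in the match.
    next nil if matches.begin(index).nil?
    # end is exclusive, so subtract 1 for an inclusive end index.
    [matches.begin(index), matches.end(index) - 1]
  end
end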

string = "foo bar baz"

match_indexes(string, /.*(foo).*/)
# => [[0, 2]]
match_indexes(string, /.*(foo).*(bar).*/)
# => [[0, 2], [4, 6]]
match_indexes(string, /.*(foo).*(bar).*(baz).*/)
# => [[0, 2], [4, 6], [8, 10]]

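You can have a look at the (kind of strange) MatchData class for how this works: MatchData#begin(n) returns the offset where capture group n starts, and MatchData#end(n) returns the offset just past its last character, hence the - 1. http://www.ruby-doc.org/core-1.9.3/MatchData.html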
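As a possible variation, MatchData#offset(n) returns the start and (exclusive) end of a group as one pair, so the same idea can be sketched with a single call per group. The helper name capture_ranges is made up for illustration and is not part of the answer above; the sketch assumes you want inclusive end indices and a nil entry for any optional group that did not take part in the match.

```ruby
# Hypothetical helper (name chosen for this sketch, not from the answer).
# Returns [start, inclusive_end] for every capture group, or nil for a
# group that did not participate in the match.
def capture_ranges(string, regex)
  match = regex.match(string)
  return [] unless match # no match at all

  # Index 0 is the whole match; capture groups start at index 1.
  (1...match.size).map do |i|
    start, stop = match.offset(i) # stop is exclusive; both nil if unmatched
    start && [start, stop - 1]
  end
end

p capture_ranges("foo bar baz", /.*(foo).*(bar).*/)
p capture_ranges("foo", /(foo)(x)?/)
p capture_ranges("nothing here", /(foo)/)
```

The first call should print the indices from the question, the second shows the nil placeholder for the unmatched optional group, and the third returns an empty array because the regex does not match.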

Tal answered Jan 09 '23 11:01
