Regex with named capture groups getting all matches in Ruby

Tags: regex, ruby

I have a string:

s="123--abc,123--abc,123--abc" 

I tried using Ruby 1.9's new feature "named groups" to fetch all named group info:

/(?<number>\d*)--(?<chars>\s*)/ 

Is there an API like Python's findall that returns a collection of MatchData objects? In this case I need three matches, because 123--abc occurs three times. Each MatchData should carry the named captures, so that I can use m['number'] to get the matched value.

mlzboy asked Jan 14 '11 15:01


2 Answers

Named groups are readable by name only through a MatchData object, and String#match returns one for the first match only.
Ruby's analogue of findall is String#scan. You can either use the result of scan as an array or pass a block to it:

irb> s = "123--abc,123--abc,123--abc"
=> "123--abc,123--abc,123--abc"
irb> s.scan(/(\d*)--([a-z]*)/)
=> [["123", "abc"], ["123", "abc"], ["123", "abc"]]
irb> s.scan(/(\d*)--([a-z]*)/) do |number, chars|
irb*   p [number, chars]
irb> end
["123", "abc"]
["123", "abc"]
["123", "abc"]
=> "123--abc,123--abc,123--abc"
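If the group names still matter, one possible approach is to keep the named groups in the pattern and pair each row returned by scan with Regexp#names. This is only a sketch: the variable names `re`, `rows` and `records` are made up for illustration, the pattern uses `+` instead of `*` so that empty strings are not matched, and `Array#to_h` assumes Ruby 2.1 or later.

```ruby
s = "123--abc,123--abc,123--abc"

# Same idea as the pattern in the question, with [a-z] for the letters.
re = /(?<number>\d+)--(?<chars>[a-z]+)/

# scan still returns plain arrays of captured strings, in group order.
rows = s.scan(re)

# Regexp#names lists the group names in the same order,
# so zipping names with each row rebuilds a name => value hash.
records = rows.map { |row| re.names.zip(row).to_h }

records.each { |r| puts "#{r['number']} #{r['chars']}" }
```

Each element of `records` is an ordinary Hash keyed by group name, so `records[0]['number']` reads much like the `m['number']` the question asks for, although it is not a MatchData object.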
Nakilon answered Sep 23 '22 04:09


Chiming in super-late, but here's a simple way of replicating String#scan but getting the matchdata instead:

matches = []
foo.scan(regex) { matches << $~ }

matches now contains the MatchData objects that correspond to scanning the string.
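Applied to the string from the question, that idea might look like the sketch below. The pattern here is an assumption (the question's groups with `[a-z]+` for the letters), and `Regexp.last_match` is used as the more readable spelling of `$~`. The enumerator form at the end is an alternative that avoids the temporary array.

```ruby
s = "123--abc,123--abc,123--abc"
re = /(?<number>\d+)--(?<chars>[a-z]+)/

# Collect one MatchData per match; Regexp.last_match returns the same object as $~.
matches = []
s.scan(re) { matches << Regexp.last_match }

# Named groups and match positions are both available on each MatchData.
matches.each do |m|
  puts "#{m[:number]} #{m[:chars]} at offset #{m.begin(0)}"
end

# Alternative: wrap scan in an Enumerator and map each step to its MatchData.
same = s.to_enum(:scan, re).map { Regexp.last_match }
```

Because each element is a real MatchData, methods such as `pre_match`, `post_match` and `names` are available in addition to lookup by group name.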

marcus erronius answered Sep 26 '22 04:09