
Perl regex to capture a repeating group

Tags: regex, perl

I want a regular expression that matches something at the beginning of a line, and then matches (and returns) all other words. For instance, given this line:

$line = "one two three etc";

I want something like this (that doesn't work):

@matches= $line=~ /^one(?:\s+(\S+))$/;

to return the words "two", "three", and "etc" in @matches.

I don't want to know other ways to get the words. I want to do it with a regular expression. It seems so simple, but I have not been able to come up with a solution.

agarrubio asked Sep 23 '14 03:09


3 Answers

To do that, use the \G anchor, which matches at the position where the previous match ended. A pattern built around this anchor returns contiguous results. The (?!\A) lookahead keeps \G from succeeding at the start of the string, so the first match has to begin with ^one:

@matches = $line =~ /(?:\G(?!\A)|^one) (\S+)/g; 
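As a minimal sketch of how this behaves, the script below runs the same idea on the line from the question. One assumption that is not in the answer: the literal space is swapped for \s+ so that tabs or repeated spaces between words are also accepted. The variable @none and the second test string are made up for illustration.

```perl
use strict;
use warnings;

my $line = "one two three etc";

# The first match must start with "one" at the beginning of the line.
# Every later match must start where the previous one ended (\G);
# (?!\A) stops \G from succeeding at offset 0 on the first attempt.
# Assumption: \s+ replaces the literal space used in the answer.
my @matches = $line =~ /(?:\G(?!\A)|^one)\s+(\S+)/g;
print join(",", @matches), "\n";   # prints: two,three,etc

# A line that does not start with "one" produces no matches at all.
my @none = "zero two three" =~ /(?:\G(?!\A)|^one)\s+(\S+)/g;
print scalar(@none), "\n";         # prints: 0
```

Because every match after the first is pinned to \G, the chain stops at the first position where a whitespace-plus-word pair no longer follows directly.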
Casimir et Hippolyte answered Sep 19 '22 16:09


You cannot have an unknown number of capture groups. If you repeat a capturing group, each iteration overwrites its contents, so only the last instance is kept:

  • Expression: ^one(?:\s+(\S+))+$
  • Capture #1: etc

Or:

  • Expression: ^one\s+(\S+)\s+(\S+)\s+(\S+)$
  • Capture #1: two
  • Capture #2: three
  • Capture #3: etc
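The two expressions above can be checked with a short script. This is only an illustrative sketch; the variable names @repeated and @fixed are not from the answer.

```perl
use strict;
use warnings;

my $line = "one two three etc";

# A repeated group still has a single slot ($1); each iteration
# of the outer (?:...)+ overwrites it, so only "etc" survives.
my @repeated = $line =~ /^one(?:\s+(\S+))+$/;
print "@repeated\n";   # prints: etc

# One explicit group per word works, but only for exactly three words.
my @fixed = $line =~ /^one\s+(\S+)\s+(\S+)\s+(\S+)$/;
print "@fixed\n";      # prints: two three etc
```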

I suggest either capturing the entire group and then splitting by spaces:

  • Expression: ^one\s+((?:\S+\s*)+)$
  • Capture #1: two three etc
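A possible way to apply the capture-then-split suggestion is sketched below. The use of split ' ' (Perl's whitespace-splitting special case) and the variable @words are assumptions for illustration; the answer only says to split by spaces.

```perl
use strict;
use warnings;

my $line = "one two three etc";

my @words;
if ($line =~ /^one\s+((?:\S+\s*)+)$/) {
    # $1 holds "two three etc". split ' ' splits on runs of
    # whitespace and skips any leading whitespace.
    @words = split ' ', $1;
}
print join(",", @words), "\n";   # prints: two,three,etc
```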

Or you can do a global match and utilize \G and \K:

  • Expression: (?:^one|(?<!\A)\G).*?\K\S+
  • Match #1: two
  • Match #2: three
  • Match #3: etc
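The global-match variant can be tried the same way. In this sketch the pattern is copied from the expression above; it has no capture group, so in list context each match contributes the text kept after \K.

```perl
use strict;
use warnings;

my $line = "one two three etc";

# \K discards everything matched so far ("one" or the text skipped
# by .*?), so each global match returns only the word matched by \S+.
my @matches = $line =~ /(?:^one|(?<!\A)\G).*?\K\S+/g;
print join(",", @matches), "\n";   # prints: two,three,etc
```

Note that .*? can skip arbitrary characters, so this pattern is looser than one that requires whitespace between words.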
Sam answered Sep 20 '22 16:09


^.*?\s\K|(\w+)

Try this. See the demo:

http://regex101.com/r/lS5tT3/2

vks answered Sep 17 '22 16:09