Getting multiple matches within a string using regex in Perl

Question

After reading this similar question and trying my code several times, I keep getting the same undesired output.

Let's assume the string I'm searching is "I saw wilma yesterday". The regex should capture each word that contains an 'a', together with up to 5 characters (spaces included) that follow that 'a'.

The code I wrote is the following:

$_ = "I saw wilma yesterday";

if (@m = /(\w+)a(.{5,})?/g){
    print "found " . @m . " matches\n";

    foreach(@m){
        print "\t\"$_\"\n";
    }
}

However, I kept on getting the following output:

found 2 matches
    "s"
    "w wilma yesterday"

while I expected to get the following one:

found 3 matches:
    "saw wil"
    "wilma yest"
    "yesterday"

I then found out that the values returned in @m were $1 and $2 of a single match, as you can see.

Now, since the /g flag is on, and I don't think the problem is about the regex, how could I get the desired output?
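
To make the observation above concrete, here is a minimal sketch of how a list-context /g match flattens its capture groups. The variable names ($str, @m, @whole) are illustrative and not part of the original code.

```perl
use strict;
use warnings;

my $str = "I saw wilma yesterday";

# In list context, m//g returns every capture group of every match,
# flattened into a single list: two groups give two elements per match.
my @m = $str =~ /(\w+)a(.{5,})?/g;
print scalar(@m), " elements from the two-group pattern\n";
print "\t\"$_\"\n" for @m;

# Wrapping the whole pattern in one group (and making the inner group
# non-capturing) gives exactly one element per match instead.
my @whole = $str =~ /(\w+a(?:.{5,})?)/g;
print scalar(@whole), " element(s) from the one-group pattern\n";
print "\t\"$_\"\n" for @whole;
```

Here the pattern matches only once, because the greedy .{5,} consumes the rest of the string. The two elements in @m are therefore $1 and $2 of that single match, not two separate matches.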

Casimir et Hippolyte · Accepted Answer

You can try this pattern that allows overlapped results:

(?=\b(\w+a.{1,5}))

or

(?=(?i)\b([a-z]+a.{0,5}))

example:

use strict;
my $str = "I saw wilma yesterday";
my @matches = ($str =~ /(?=\b([a-z]+a.{0,5}))/gi);
print join("\n", @matches), "\n";

more explanations:

A plain regex cannot return overlapping results: once the engine has consumed ("eaten") a character as part of a match, that character cannot be part of a later match. The trick is to put the capturing group inside a lookahead, an assertion that only checks what follows without consuming it, so the same characters can be examined from several starting positions.

For another example of this behaviour, you can try the example code without the word boundary (\b) to see the result.
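
As a rough sketch of that comparison, the following runs the example pattern with and without the word boundary. The array names @anchored and @unanchored are illustrative only.

```perl
use strict;
use warnings;

my $str = "I saw wilma yesterday";

# With \b, the lookahead can only succeed at the start of a word.
my @anchored = $str =~ /(?=\b([a-z]+a.{0,5}))/gi;

# Without \b, the lookahead is also tried from every later letter
# inside each word, so suffixes of the same words match as well.
my @unanchored = $str =~ /(?=([a-z]+a.{0,5}))/gi;

print scalar(@anchored), " matches with \\b:\n";
print "\t\"$_\"\n" for @anchored;

print scalar(@unanchored), " matches without \\b:\n";
print "\t\"$_\"\n" for @unanchored;
```

With \b the list holds the three expected strings. Without it, the list also contains suffix matches such as "ilma yest" and "day", because every position in the string is a candidate starting point for the lookahead.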

PP. · Answer

Firstly you want to capture everything inside the expression, i.e.:

/(\w+a(?:.{5,})?)/

Next, you want each new search to start one character past the position where the previous match began.

The pos() function lets you read or set the offset at which the next /g match on a string starts.
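
The two steps above could be sketched as the loop below. Note the assumptions: the quantifier is .{0,5} rather than .{5,}, and a leading \b has been added. Neither is in this answer's pattern; both are there only to reproduce the output expected in the question. The names $str and @found are illustrative.

```perl
use strict;
use warnings;

my $str = "I saw wilma yesterday";
my @found;

# Scalar-context m//g finds one match per iteration and remembers
# where it stopped in pos($str).
while ($str =~ /\b(\w+a.{0,5})/g) {
    push @found, $1;

    # $-[0] is the offset where this match began. Restart the next
    # search one character after it, so matches may overlap.
    pos($str) = $-[0] + 1;
}

print "found " . @found . " matches\n";
print "\t\"$_\"\n" for @found;
```

Because pos($str) always moves at least one character forward, the loop terminates. When a match fails, Perl resets pos($str) and the while condition becomes false.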
