
Detect difference of match in list context with a capturing vs non-capturing regexp?

Tags: regex, perl

According to perlretut

... in scalar context, $time =~ /(\d\d):(\d\d):(\d\d)/ returns a true or false value. In list context, however, it returns the list of matched values ($1,$2,$3) .

But I can't find an explanation of what is returned in list context when the pattern matches and the regexp contains no capturing groups. Testing shows that it is the list (1) (a single element, the integer 1). (Ancillary question: will it always be this, and where is it defined?)
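The behaviour described above can be reproduced with a short test. This is only a sketch; the variable names and sample string are made up for illustration.

```perl
use strict;
use warnings;

my $time = '12:34:56';

# With capturing groups, list context yields the captured values.
my @with_groups = ($time =~ /(\d\d):(\d\d):(\d\d)/);

# Without capturing groups, a successful match yields the one-element list (1).
my @without_groups = ($time =~ /\d\d:\d\d:\d\d/);

# A failed match yields the empty list.
my @no_match = ($time =~ /(\d\d)-(\d\d)/);

print "with groups:    (@with_groups)\n";
print "without groups: (@without_groups)\n";
print "no match:       ", scalar(@no_match), " elements\n";
```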

This makes it difficult to do what I want:

if (my @captures = ($input =~ $regexp)) {
    furtherProcessing(@captures);
}

I want furtherProcessing to be called if there is a match, with any captured groups passed as arguments. The problem comes when the $regexp contains no capturing groups because then I want furtherProcessing to be called with no arguments, not with the value 1 which is what happens in the above. I can't test for (1) as a special case, like this

if (my @captures = ($input =~ $regexp)) {
    shift @captures if $captures[0] == 1;
    furtherProcessing(@captures);
}

because in the case of

$input = 'a value:1';
$regexp = qr/value:(\S+)/;

there is a captured value in @captures that happens to look the same as what I get when the $regexp matches but has no capturing groups.

Is there a way to do what I want?

Day asked May 21 '11 22:05


1 Answer

You can use $#+ (the last index of the @+ array) to find out how many capturing groups were in the last successful match. If that's 0, then the pattern had no groups and the match returned (1). (Yes, it will always be (1) if there are no groups, as documented in perlop.)

So, this will do what you want:

if (my @captures = ($input =~ $regexp)) {
    @captures = () unless $#+; # Only want actual capture groups
    furtherProcessing(@captures);
}
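One way to package this for reuse is sketched below. The helper name match_captures and the stub body of furtherProcessing are assumptions made for illustration; neither comes from the question or the answer. The helper returns an array reference so that "matched with no groups" (an empty array) stays distinct from "did not match" (undef).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer): returns an array reference of
# captures on a match, or undef when the pattern does not match.
sub match_captures {
    my ($input, $regexp) = @_;
    my @captures = ($input =~ $regexp);
    return undef unless @captures;   # a failed match returns the empty list
    @captures = () unless $#+;       # matched, but the pattern has no groups
    return \@captures;
}

# Stand-in for the question's furtherProcessing; it only reports its arguments.
sub furtherProcessing {
    my @args = @_;
    return scalar(@args) . " argument(s): [@args]";
}

for my $regexp (qr/value:(\S+)/, qr/value:/, qr/missing/) {
    if (my $captures = match_captures('a value:1', $regexp)) {
        print furtherProcessing(@$captures), "\n";
    }
    else {
        print "no match\n";
    }
}
```

An empty array reference is still true in boolean context, so the if test succeeds for a pattern without groups and furtherProcessing is called with no arguments.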

Note that $#+ counts all groups in the pattern, whether or not each one took part in the match (as long as the entire regexp matched). So after "hello" =~ /hello( world)?/, $#+ is 1 even though the group didn't match, and @captures contains a single undef.
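A minimal check of the optional-group case, using illustrative variable names:

```perl
use strict;
use warnings;

# The optional group does not take part in the match, but it still counts.
my @captures    = ("hello" =~ /hello( world)?/);
my $group_count = $#+;   # number of groups in the last successful match

printf "groups in pattern: %d\n", $group_count;
printf "elements returned: %d\n", scalar @captures;
printf "first capture is %s\n", defined $captures[0] ? "defined" : "undef";
```

Because $group_count is non-zero here, the answer's test keeps @captures as it is, and furtherProcessing would receive one undef argument.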

cjm answered Oct 14 '22 03:10