The built-in variables @- and @+ hold the start and end positions, respectively, of the last successful match. $-[0] and $+[0] correspond to the entire pattern, while $-[N] and $+[N] correspond to the $N ($1, $2, etc.) submatches.
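As a rough illustration of how the indices line up with capture groups, here is a minimal sketch. The sample string, the pattern, and the variable name @group_spans are made up for this example; it assumes every group takes part in the match (a group that did not participate would have undef offsets).

```perl
use strict;
use warnings;

my $string = "hello world";
my @group_spans;    # hypothetical name: one [start, end] pair per group

if ($string =~ /(wor)(ld)/) {
    # $#+ is the index of the last capture group of the last successful match
    for my $n (0 .. $#+) {
        push @group_spans, [ $-[$n], $+[$n] ];
    }
}

# Index 0 is the whole match, 1 and 2 are $1 and $2
printf "group %d: start %d, end %d\n", $_, @{ $group_spans[$_] }
    for 0 .. $#group_spans;
```

The end offset is the position just past the last matched character, so the length of a group is its end minus its start.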
Forget my previous post, I've got a better idea.
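# Return (start, end) of the first match, or nothing if there is no match
sub match_positions {
    my ($regex, $string) = @_;
    return if not $string =~ /$regex/;
    return ($-[0], $+[0]);
}

# Return a list of [start, end] pairs, one per match
sub match_all_positions {
    my ($regex, $string) = @_;
    my @ret;
    while ($string =~ /$regex/g) {
        push @ret, [ $-[0], $+[0] ];
    }
    return @ret;
}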
This technique doesn't change the regex in any way.
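To show what the returned offsets are good for, here is a small sketch that turns them back into substrings with substr. The sample string and the name @found are invented for this example; the loop is the same @- / @+ idea written inline rather than a call to the subs above.

```perl
use strict;
use warnings;

my $string = "one 22 three 4444";
my @found;    # hypothetical name: the matched substrings

while ($string =~ /\d+/g) {
    my ($start, $end) = ($-[0], $+[0]);
    # The end offset is exclusive, so the length is end minus start
    push @found, substr($string, $start, $end - $start);
}

print join(", ", @found), "\n";
```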
Edited to add: to quote perlvar on $1..$9: "These variables are all read-only and dynamically scoped to the current BLOCK." In other words, if you want to use $1..$9 after the match, you cannot use a subroutine to do the matching, because the values set inside the subroutine are not visible once it returns.
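One possible way around that scoping limit, offered only as a sketch: have the subroutine hand the captures back as its return value. The helper name match_captures and the sample data are hypothetical. It relies on a match in list context returning the list of captured groups, and it assumes the pattern contains at least one capture group (without any, a successful match returns (1) instead).

```perl
use strict;
use warnings;

# Hypothetical helper: returns ($1, $2, ...) on success, an empty list on failure.
# It must be called in list context for the captures to come back.
sub match_captures {
    my ($regex, $string) = @_;
    return $string =~ /$regex/;
}

my ($key, $value) = match_captures(qr/(\w+)=(\w+)/, "colour=blue");
print "$key is $value\n" if defined $key;
```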
The pos function gives you the position where the last m//g match ended; it is only set by matches that use the /g modifier. If you put your regex in parentheses you can get the length of the match (and thus its start) using length $1. Like this:
sub match_positions {
    my ($regex, $string) = @_;
    # /g is required here: without it, pos($string) is not set
    return if not $string =~ /($regex)/g;
    return (pos($string) - length($1), pos($string));
}
sub all_match_positions {
    my ($regex, $string) = @_;
    my @ret;
    while ($string =~ /($regex)/g) {
        push @ret, [ pos($string) - length($1), pos($string) ];
    }
    return @ret;
}