I want to search a file for a string and then get the offsets of all the matches. The content of the file is as follows:
sometext
sometext
AAA
sometext
AAA
AAA
sometext
I am reading the whole file into a string $text and then doing a regex match for AAA as follows:
if($text =~ m/AAA/g) {
$offset = $-[0];
}
This gives the offset of only one AAA. How can I get the offsets of all the matches?
I know that we can get all matches in an array using syntax like this:
my @matches = ($text =~ m/AAA/g);
But I want the offsets, not the matched strings.
Currently I am using the following code to get the offsets of all matches:
my $text= "sometextAAAsometextAAA";
my $regex = 'AAA';
my @matches = ();
while ($text =~ /($regex)/gi){
my $match = $1;
my $length = length($&);
my $pos = length($`);
my $start = $pos + 1;
my $end = $pos + $length;
my $hitpos = "$start-$end";
push @matches, "$match found at $hitpos ";
}
print "$_\n" foreach @matches;
But is there a simpler way to do this?
You already know that you should use $-[0]! Replace
while ($text =~ /($regex)/gi){
my $match = $1;
my $length = length($&);
my $pos = length($`);
my $start = $pos + 1;
my $end = $pos + $length;
my $hitpos = "$start-$end";
push @matches, "$match found at $hitpos ";
}
with
while ($text =~ /($regex)/gi){
push @matches, "$1 found at $-[0]";
}
That said, I'm a big fan of separating calculations from output formatting, so I would do
while ($text =~ /($regex)/gi){
push @matches, [ $1, $-[0] ];
}
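As a minimal sketch of that separation, using the sample string from the question, the loop below stores each match together with its start and end offsets and only formats the output afterwards. Note that $-[0] is zero-based, whereas the question's $start was one-based; $+[0] is the offset just past the match, so it happens to equal the question's one-based $end. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $text  = "sometextAAAsometextAAA";
my $regex = 'AAA';

# Collect [matched text, start offset, end offset] for every match.
# $-[0] is the zero-based start of the whole match; $+[0] is the
# offset just past its last character.
my @matches;
while ($text =~ /($regex)/gi) {
    push @matches, [ $1, $-[0], $+[0] ];
}

# Formatting happens separately from the matching loop.
for my $m (@matches) {
    my ($str, $start, $end) = @$m;
    print "$str found at $start-$end\n";
}
```

For this input the script prints "AAA found at 8-11" and "AAA found at 19-22".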
PS: Unless you've unrolled a while loop, if (/.../g) makes no sense. At best, the /g does nothing. At worst, you get incorrect results.
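To illustrate how the incorrect results can arise, here is a small sketch (the string and the @results array are made up for the example). In scalar context, a successful /g match leaves pos($text) just past the match, so a second /g check on the same string resumes from there rather than from the start.

```perl
use strict;
use warnings;

my $text = "AAA sometext";
my @results;

# First check: succeeds and leaves pos($text) at 3, just past the match.
if ($text =~ /AAA/g) { push @results, "match" } else { push @results, "no match" }

# Second check: /g resumes from pos($text), finds nothing further,
# fails, and resets pos($text).
if ($text =~ /AAA/g) { push @results, "match" } else { push @results, "no match" }

# Without /g the search always starts at the beginning of the string.
if ($text =~ /AAA/)  { push @results, "match" } else { push @results, "no match" }

print join(", ", @results), "\n";   # prints "match, no match, match"
```

The second check reports no match even though the string clearly contains AAA, which is the kind of surprise a stray /g in an if can cause.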
I don't think there's a built-in way to do this in Perl, but here is a helper from "How can I find the location of a regex match in Perl?":
sub match_all_positions {
    my ($regex, $string) = @_;
    my @ret;
    # Each iteration finds the next match; @- and @+ hold its start and end offsets.
    while ($string =~ /$regex/g) {
        push @ret, [ $-[0], $+[0] ];
    }
    return @ret;
}
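A possible way to use this helper on the file content from the question is sketched below. The sub is repeated so the example runs on its own, and the newline-joined $text is an assumption about how the file was read into a string. Because $+[0] points just past the match, the matched text can be recovered with substr using $end - $start as the length.

```perl
use strict;
use warnings;

# Same helper as above, repeated so the example is self-contained.
sub match_all_positions {
    my ($regex, $string) = @_;
    my @ret;
    while ($string =~ /$regex/g) {
        push @ret, [ $-[0], $+[0] ];
    }
    return @ret;
}

# Assumed file content: the seven lines from the question, newline-terminated.
my $text = "sometext\nsometext\nAAA\nsometext\nAAA\nAAA\nsometext\n";

my @spans = match_all_positions('AAA', $text);

for my $span (@spans) {
    my ($start, $end) = @$span;
    # $end is exclusive, so the match length is $end - $start.
    my $matched = substr($text, $start, $end - $start);
    print "$matched: start=$start end=$end\n";
}
```

With this input the spans are 18 to 21, 31 to 34, and 35 to 38, counting each newline as one character.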