
Matching two overlapping patterns with Perl

I hope that my question has not already been posed by someone else, since I tried to look almost everywhere in the site but I couldn't manage to find an answer.

My problem is: I'm writing a Perl script that has to detect the position of every occurrence of one or another pattern in a string.

For instance:

$string = "betaalphabetabeta";
$pattern = "beta|alpha";

In this case, I would like my script to return 4 matches.

I thought that this could be easily achieved by using the match operator in some way like this:

$string =~ /beta|alpha/g;

However, since my two patterns ("alpha", "beta") are partially overlapping, the piece of code that I've just posted skips any occurrence of the first pattern when it overlaps with the second one.

E.g. if I have a string like this one:

$string = "betalphabetabeta";

it only returns 3 matches instead of 4.
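
For reference, here is a minimal sketch that reproduces the behaviour and records where each match starts. The helper name plain_matches is made up for illustration; $-[1] is Perl's built-in start offset of the first capture group.

```perl
use strict;
use warnings;

# Hypothetical helper: collect [offset, text] for every match of the
# plain alternation under /g.
sub plain_matches {
    my ($string) = @_;
    my @found;
    while ($string =~ /(beta|alpha)/g) {
        push @found, [ $-[1], $1 ];   # $-[1] = start offset of group 1
    }
    return @found;
}

# Print one "offset text" line per match.
printf "%2d %s\n", @$_ for plain_matches("betalphabetabeta");
```

Under /g each search resumes where the previous match ended, so once "beta" has been consumed at offset 0, the "alpha" that starts at offset 3 can no longer be found.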

I've tried to do something with the ?= operator, but I can't manage to couple it with the OR operator in a correct way...

Does anyone have any solution? Thanks for your help!

selenocysteine asked Jan 10 '13 14:01


2 Answers

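The following uses a zero-width positive lookahead assertion, (?=...), with the capture group placed inside it. Because the lookahead consumes no characters, the next search starts just one position further on, so overlapping occurrences are still found.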

#!/usr/bin/perl
use strict;
use warnings;

$_ = "betalphabetabeta";

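# The lookahead matches at a position without consuming any text;
# $1 holds whichever word starts at that position.
while (/(?=(alpha|beta))/g) {
    print $1, "\n";
}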

Prints:

C:\Old_Data\perlp>perl t9.pl
beta
alpha
beta
beta
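
Since the question asks for the position of each occurrence, the same loop can also record offsets. This is only a sketch: the helper name overlapping_matches is invented here, and it relies on $-[1], Perl's built-in start offset of the first capture group.

```perl
use strict;
use warnings;

# Hypothetical helper: return [offset, text] for every occurrence of
# "alpha" or "beta", including occurrences that overlap.
sub overlapping_matches {
    my ($string) = @_;
    my @found;
    while ($string =~ /(?=(alpha|beta))/g) {
        push @found, [ $-[1], $1 ];   # $-[1] = start offset of group 1
    }
    return @found;
}

# Print one "offset text" line per occurrence.
printf "%2d %s\n", @$_ for overlapping_matches("betalphabetabeta");
```

For "betalphabetabeta" this should report beta at 0, alpha at 3, beta at 8 and beta at 12 (offsets are zero-based).
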
Chris Charley answered Nov 14 '22 06:11


You have to use a lookahead and count the number of matches:

(?=beta|alpha)

Not tested in perl but should work

works here
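
In Perl, counting with that lookahead might look like the sketch below. The helper name count_overlapping is made up for illustration; the "= () =" construct is the usual idiom for forcing list context and then taking the number of elements.

```perl
use strict;
use warnings;

# Hypothetical helper: count every position where "beta" or "alpha"
# starts, whether or not the occurrences overlap.
sub count_overlapping {
    my ($string) = @_;
    # In list context m//g returns one element per match; assigning that
    # list to () in scalar context yields the number of elements.
    my $count = () = $string =~ /(?=beta|alpha)/g;
    return $count;
}

print count_overlapping("betalphabetabeta"), "\n";
```

For the string from the question this gives 4 rather than 3.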

Anirudha answered Nov 14 '22 07:11