I have a string of arbitrary length, and starting at position p0, I need to find the first occurrence of one of three 3-letter patterns.
Assume the string contains only letters. I need to find the count of triplets starting at position p0 and jumping forward in triplets until the first occurrence of either 'aaa' or 'bbb' or 'ccc'.
Is this even possible using just a regex?
Moritz says this might be faster than a regex. Even if it's a little slower, it's easier to understand at 5 am. :)
#             0123456789.123456789.123456789.
my $string = "alsdhfaaasccclaaaagalkfgblkgbklfs";
my $pos = 9;
my $length = 3;
my $regex = qr/^(aaa|bbb|ccc)/;

while( $pos < length $string ) {
    print "Checking $pos\n";
    if( substr( $string, $pos, $length ) =~ /$regex/ ) {
        print "Found $1 at $pos\n";
        last;
    }
    $pos += $length;
}
$string =~ /^               # from the start of the string
            (?:.{$p0})      # skip (don't capture) "$p0" occurrences of any character
            (?:...)*?       # skip 3 characters at a time,
                            # as few times as possible (non-greedy)
            (aaa|bbb|ccc)   # capture aaa or bbb or ccc as $1
           /x;
(Assuming p0 is 0-based).
Of course, it's probably more efficient to use substr on the string to skip forward:
substr($string, $p0)=~/^(?:...)*?(aaa|bbb|ccc)/;
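To get the triplet count the question asks for out of the substr() version, one option is to read the match offset from `@-`. This is only a sketch: the helper name `first_triplet_hit` is made up for illustration, and it assumes p0 is 0-based and the string holds only letters.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer above): returns the number of
# triplets skipped and the pattern found, or an empty list if nothing matches.
sub first_triplet_hit {
    my ($string, $p0) = @_;
    return if $p0 > length($string);    # nothing to search past the end

    if ( substr($string, $p0) =~ /^(?:...)*?(aaa|bbb|ccc)/ ) {
        # $-[1] is the offset of capture group 1 inside the substring,
        # so dividing by 3 gives the number of triplets skipped.
        return ( $-[1] / 3, $1 );
    }
    return;
}

my ($count, $hit) = first_triplet_hit("alsdhfaaasccclaaaagalkfgblkgbklfs", 9);
print "Skipped $count triplets before '$hit'\n" if defined $count;
```

Note that the offset is relative to the substring, so the absolute position would be `$p0 + $-[1]`.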
You can't really count with regexes, but you can do something like this:
pos $string = $start_from;
$string =~ m/\G          # anchor to previous pos()
    ((?:...)*?)          # capture everything up to the match
    (aaa|bbb|ccc)
/xs or die "No match";
my $result = length($1) / 3;
But I think it's a bit faster to use substr() and unpack() to split the string into triples and walk the triples in a for-loop.
(edit: it's length(), not lenght() ;-)
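The substr()/unpack() idea mentioned above was not spelled out, so here is one way it might look. Treat it as a sketch under assumptions: the helper name `count_triplets_until_hit` is invented for illustration, p0 is taken to be 0-based, and no claim is made about whether it is actually faster than the regex.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (triplet_count, pattern) for the first
# triplet that is one of the wanted patterns, or an empty list otherwise.
sub count_triplets_until_hit {
    my ($string, $p0) = @_;
    return if $p0 > length($string);

    my %wanted = map { $_ => 1 } qw(aaa bbb ccc);

    # '(a3)*' cuts the rest of the string into consecutive 3-character
    # chunks; a trailing chunk may be shorter and can never match.
    my @triples = unpack '(a3)*', substr($string, $p0);

    for my $i (0 .. $#triples) {
        return ( $i, $triples[$i] ) if $wanted{ $triples[$i] };
    }
    return;
}

my ($n, $found) = count_triplets_until_hit("alsdhfaaasccclaaaagalkfgblkgbklfs", 9);
print "Skipped $n triplets before '$found'\n" if defined $n;
```

A hash lookup replaces the alternation here, which keeps the loop body trivial; the trade-off is that the whole tail of the string is split up front even if the hit comes early.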