Perl: "Quantifier in {,} bigger than 32766 in regex"

Question

Let's say I want to find, in a large string (300,000 letters), the word "dogs" with exactly 40,000 letters between each pair of consecutive letters. So I do:

$mystring =~ m/d.{40000}o.{40000}g.{40000}s/;

This works quite well in other (slower) languages, but in Perl it fails with "Quantifier in {,} bigger than 32766 in regex".

So:

Can we use a bigger number as the quantifier somehow?
If not, is there another good way to find what I want? Note that "dogs" is only an example; I want to do this for any word and any jump size (and fast).

Ben Jackson · Accepted Answer

If you really need to do this fast, I would look at a custom search based on the ideas of the Boyer-Moore string search. A regular expression is parsed into a finite state machine, and even a clever, compact representation of such an FSM is not going to be a very effective way to execute a search like the one you describe.

If you really want to continue along your current lines, you can simply concatenate two expressions, such as .{30000}.{10000}, which in practice matches the same thing as .{40000}.
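A minimal sketch of that splitting, generalized to any word and any gap size. The helper names gap_pattern and spaced_regex are hypothetical and not part of the answer; the chunk size is simply the 32766 limit quoted in the error message.

```perl
use strict;
use warnings;

# Hypothetical helper: build a pattern fragment equivalent to .{$gap},
# split into pieces no larger than the limit from the error message.
sub gap_pattern {
    my ($gap) = @_;
    my $max  = 32_766;
    my $full = int($gap / $max);    # number of full-size pieces
    my $rest = $gap % $max;         # leftover piece, possibly zero
    my $pat  = ".{$max}" x $full;
    $pat .= ".{$rest}" if $rest;
    return $pat;
}

# Hypothetical helper: compile a regex that matches the letters of $word
# with exactly $gap arbitrary characters between neighbouring letters.
sub spaced_regex {
    my ($word, $gap) = @_;
    my $sep = gap_pattern($gap);
    my $pat = join $sep, map { quotemeta } split //, $word;
    return qr/$pat/;
}
```

With these definitions, spaced_regex('dogs', 40_000) should produce the equivalent of the pattern in the question, and it can be used as $mystring =~ spaced_regex('dogs', 40_000). As in the original pattern, . does not match a newline here; add the /s modifier if the string can contain newlines.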

Sinan Ünür · Answer

I think index might be better suited for this task. Something along the lines of the following (completely untested):

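sub has_dogs {
    my $str   = shift;    # reference to the string, so it is not copied
    my $start = 0;

    while (-1 < (my $pos = index $$str, 'd', $start)) {
        # Stop once the final 's' would fall past the end of the string.
        last if $pos + 120_003 >= length $$str;
        # d.{40000}o puts the 'o' 40,001 characters after the 'd', and so on.
        if ( ('o' eq substr($$str, $pos +  40_001, 1)) and
             ('g' eq substr($$str, $pos +  80_002, 1)) and
             ('s' eq substr($$str, $pos + 120_003, 1)) ) {
            return 1;
        }
        $start = $pos + 1;    # resume the search after this 'd'
    }
    return;
}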
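
Since the question asks for any word and any jump size, the same index/substr idea can be generalized. The following is only a sketch: the name find_spaced and its return convention (offset of the first letter, or -1) are assumptions made for illustration. Here $gap counts the characters strictly between two letters, as in the question's regex, so consecutive letters sit $gap + 1 positions apart.

```perl
use strict;
use warnings;

# Hypothetical helper: return the offset of the first occurrence of $word
# whose letters are separated by exactly $gap other characters, or -1.
sub find_spaced {
    my ($str_ref, $word, $gap) = @_;
    my @letters = split //, $word;
    return -1 unless @letters;

    my $step       = $gap + 1;                        # distance between letter positions
    my $span       = $step * $#letters;               # offset of last letter from first
    my $last_start = length($$str_ref) - $span - 1;   # last start that still fits

    my $start = 0;
    while ((my $pos = index $$str_ref, $letters[0], $start) > -1) {
        last if $pos > $last_start;                   # the rest cannot fit
        my $ok = 1;
        for my $i (1 .. $#letters) {
            if (substr($$str_ref, $pos + $i * $step, 1) ne $letters[$i]) {
                $ok = 0;
                last;
            }
        }
        return $pos if $ok;
        $start = $pos + 1;                            # resume after this candidate
    }
    return -1;
}
```

A call such as find_spaced(\$mystring, 'dogs', 40_000) takes the string by reference, like has_dogs above, so the large string is not copied.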

Tags:

regex

perl

Gadi A
