matching finite natural number series

Question

How can I match finite natural number series with regex?

So, requirements are:

string contains numbers and spaces (as delimiters)
first number is 1
each number (except the first one) is equal to previous number + 1

Should be matched:

1
1 2
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
long series of consequent numbers from 1 to 10^1000

Should not be matched:

``
1 3 4
1 2 3 4 5 6 6

Beside that there are some requirements to regex:

it should be single one-shot expression, and not a cycle-condition-algorithm bale of instructions
it could use all power of perl regular expressions

I'm not sure that regex are actually lazy, so it would be great if they are. Because the natural number series is non-finite in it's original meaning from number theory.

And the last one. Please notice, that I'm not using the wrong tool for that job. It's not a real world programming task at all.

tchrist · Accepted Answer

Here you go. Tested on Perl v5.10 through v5.14. The key is the recursive pattern, where we recurse on the (?&Sequence) rule. It’s something of a proof by induction.

The bigint is there just in case you really want to generate a sequence from 1 .. 10**10_000. It will run considerably faster if you can limit yourself to machine native ints, 32-bit or 64-bit depending on your platform.

#!/usr/bin/env perl
use v5.10;
use bigint;  # only if you need stuff over maxint

my $pat = qr{
    ^
    (?= 1 \b )
    (?<Sequence>
        (?<Number> \d+ )
        (?:
            \s+
            (??{  "(?=" . (1 + $+{Number}) . ")" })
            (?&Sequence)
        )?
    )
    $
}x;

# first test embedded data
while (<DATA>) {
    if ( /$pat/ ) {
        print "PASS: ", $_;

    } else {
        print "FAIL: ", $_;
    }
}

# now generate long sequences
for my $big ( 2, 10, 25, 100, 1000, 10_000, 100_000 ) {
    my $str = q();
    for (my $i = 1; $i <= $big; $i++) {
        $str .= "$i ";
    }
    chop $str;
    if ($str =~ $pat) {
        print "PASS: ";
    } else {
        print "FAIL: ";
    }
    if (length($str) > 60) {
        my $len = length($str);
        my $first = substr($str,   0, 10);
        my $last  = substr($str, -10);
        $str = $first . "[$len chars]" . $last;
    }
    say $str;

}


__END__
5
fred
1
1 2 3
1 3 2
1 2 3 4 5
1 2 3 4 6
2 3 4 6
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
1 2 3 4 5 6 6

Which run produces:

FAIL: 5
FAIL: fred
PASS: 1
PASS: 1 2 3
FAIL: 1 3 2
PASS: 1 2 3 4 5
FAIL: 1 2 3 4 6
FAIL: 2 3 4 6
PASS: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
FAIL: 1 2 3 4 5 6 6
PASS: 1 2
PASS: 1 2 3 4 5 6 7 8 9 10
PASS: 1 2 3 4 5 [65 chars]2 23 24 25
PASS: 1 2 3 4 5 [291 chars] 98 99 100
PASS: 1 2 3 4 5 [3892 chars]8 999 1000
PASS: 1 2 3 4 5 [588894 chars]999 100000

At the risk of seeming self-serving, there is a book that covers this sort of thing. See the section on “Fancy Patterns” in Chapter 5 of Programming Perl, 4ᵗʰ edition. You’ll want to check out the new sections on “Named Groups”, on “Recursive Patterns”, and on “Grammatical Patterns”. The book is at the printers, and should be available electronically in a day or two.

matching finite natural number series

Tags:

regex

perl

ДМИТРИЙ МАЛИКОВ

1 Answers

tchrist

Recent Activity

Donate For Us

matching finite natural number series

Tags:

regex

perl

ДМИТРИЙ МАЛИКОВ

1 Answers

tchrist

Related questions

Recent Activity

Donate For Us