How can I match finite natural number series with regex?
So, requirements are:
1
Should be matched:
1
1 2
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Should not be matched:
1 3 4
1 2 3 4 5 6 6
Beside that there are some requirements to regex:
perl
regular expressions I'm not sure that regex are actually lazy, so it would be great if they are. Because the natural number series is non-finite in it's original meaning from number theory.
And the last one. Please notice, that I'm not using the wrong tool for that job. It's not a real world programming task at all.
Here you go. Tested on Perl v5.10 through v5.14. The key is the recursive pattern, where we recurse on the (?&Sequence)
rule. It’s something of a proof by induction.
The bigint
is there just in case you really want to generate a sequence from 1 .. 10**10_000
. It will run considerably faster if you can limit yourself to machine native ints, 32-bit or 64-bit depending on your platform.
#!/usr/bin/env perl
use v5.10;
use bigint; # only if you need stuff over maxint
my $pat = qr{
^
(?= 1 \b )
(?<Sequence>
(?<Number> \d+ )
(?:
\s+
(??{ "(?=" . (1 + $+{Number}) . ")" })
(?&Sequence)
)?
)
$
}x;
# first test embedded data
while (<DATA>) {
if ( /$pat/ ) {
print "PASS: ", $_;
} else {
print "FAIL: ", $_;
}
}
# now generate long sequences
for my $big ( 2, 10, 25, 100, 1000, 10_000, 100_000 ) {
my $str = q();
for (my $i = 1; $i <= $big; $i++) {
$str .= "$i ";
}
chop $str;
if ($str =~ $pat) {
print "PASS: ";
} else {
print "FAIL: ";
}
if (length($str) > 60) {
my $len = length($str);
my $first = substr($str, 0, 10);
my $last = substr($str, -10);
$str = $first . "[$len chars]" . $last;
}
say $str;
}
__END__
5
fred
1
1 2 3
1 3 2
1 2 3 4 5
1 2 3 4 6
2 3 4 6
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
1 2 3 4 5 6 6
Which run produces:
FAIL: 5
FAIL: fred
PASS: 1
PASS: 1 2 3
FAIL: 1 3 2
PASS: 1 2 3 4 5
FAIL: 1 2 3 4 6
FAIL: 2 3 4 6
PASS: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
FAIL: 1 2 3 4 5 6 6
PASS: 1 2
PASS: 1 2 3 4 5 6 7 8 9 10
PASS: 1 2 3 4 5 [65 chars]2 23 24 25
PASS: 1 2 3 4 5 [291 chars] 98 99 100
PASS: 1 2 3 4 5 [3892 chars]8 999 1000
PASS: 1 2 3 4 5 [588894 chars]999 100000
At the risk of seeming self-serving, there is a book that covers this sort of thing. See the section on “Fancy Patterns” in Chapter 5 of Programming Perl, 4ᵗʰ edition. You’ll want to check out the new sections on “Named Groups”, on “Recursive Patterns”, and on “Grammatical Patterns”. The book is at the printers, and should be available electronically in a day or two.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With