
Using regular expressions to compare numbers

Tags:

regex

So this is an obvious case of "you're doing it wrong". I don't actually intend on doing this, but a conversation at work spurred this question:

Can you generate a regular expression to determine whether an integer is less than an arbitrary value?

For some values this is easy (assuming the pattern is anchored at both ends). For integers less than 1000, \d{1,3} should do the trick. For integers < 500, it's a bit trickier, but not that bad, as you can use [0-4]?\d{1,2}.

Once you get to arbitrary values it gets a lot trickier. For example, all numbers less than 255 would be something like \d{1,2}|[01]\d{2}|2[0-4]\d|25[0-4].
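Hand-written patterns like these are easy to get subtly wrong, so a brute-force check helps. Here is a minimal Python sketch, assuming non-negative integers written without leading zeros and whole-string matching. The helper name mismatches and the test limit of 2000 are arbitrary choices for illustration, and the stray spaces in the 255 pattern are dropped.

```python
import re

# Patterns from the question, keyed by the bound they are meant to enforce.
PATTERNS = {
    1000: r"\d{1,3}",
    500: r"[0-4]?\d{1,2}",
    255: r"\d{1,2}|[01]\d{2}|2[0-4]\d|25[0-4]",
}


def mismatches(pattern, bound, limit=2000):
    """Return the integers in [0, limit) where the regex disagrees with n < bound."""
    rx = re.compile(pattern)
    return [
        n for n in range(limit)
        if bool(rx.fullmatch(str(n))) != (n < bound)
    ]


for bound, pattern in PATTERNS.items():
    # An empty list means the pattern agrees with "n < bound" on the tested range.
    print(bound, mismatches(pattern, bound))
```

Using fullmatch stands in for explicit ^ and $ anchors. Without anchoring, \d{1,3} would happily match inside 1000.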

Is there a single regular expression that works here? Or do you have to programmatically generate the regex?

(And again, let me point out that I have no intention of actually doing this. Obviously using "foo < bar" in your favorite programming language is far more efficient and easy to read.)

Matt31415 asked Feb 24 '12 16:02



2 Answers

This is quite easy.

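#!/usr/bin/env perl
use strict;
use warnings;
use Regexp::Assemble;

# For each bound given on the command line, assemble one pattern for 1 .. $n.
for my $n (@ARGV)  {
    my $asm = Regexp::Assemble->new;
    for (1 .. $n) { $asm->add($_) }    # add every integer as a literal string
    for ($asm->re){                    # alias $_ to the assembled regex
        s/\)$/\$/;                     # replace the wrapper's closing paren with $
        s/^[^:]*:/^/;                  # replace the wrapper's leading (?flags: with ^
        print "$n => /$_/\n";
    }
}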

Now run it to find the pattern that matches integers between 1 and that number:

$ perl /tmp/ra 5 15 153 401 1144
5 => /^[12345]$/
15 => /^(?:[23456789]|1[012345]?)$/
153 => /^(?:1(?:[6789]|5[0123]?|0\d?|1\d?|2\d?|3\d?|4\d?)?|2\d?|3\d?|4\d?|5\d?|6\d?|7\d?|8\d?|9\d?)$/
401 => /^(?:1(?:0\d?|1\d?|2\d?|3\d?|4\d?|5\d?|6\d?|7\d?|8\d?|9\d?)?|2(?:0\d?|1\d?|2\d?|3\d?|4\d?|5\d?|6\d?|7\d?|8\d?|9\d?)?|3(?:0\d?|1\d?|2\d?|3\d?|4\d?|5\d?|6\d?|7\d?|8\d?|9\d?)?|4(?:[123456789]|0[01]?)?|5\d?|6\d?|7\d?|8\d?|9\d?)$/
1144 => /^(?:1(?:0(?:0\d?|1\d?|2\d?|3\d?|4\d?|5\d?|6\d?|7\d?|8\d?|9\d?)?|1(?:[56789]|4[01234]?|0\d?|1\d?|2\d?|3\d?)?|2\d?|3\d?|4\d?|5\d?|6\d?|7\d?|8\d?|9\d?)?|2(?:0\d?|1\d?|2\d?|3\d?|4\d?|5\d?|6\d?|7\d?|8\d?|9\d?)?|3(?:0\d?|1\d?|2\d?|3\d?|4\d?|5\d?|6\d?|7\d?|8\d?|9\d?)?|4(?:0\d?|1\d?|2\d?|3\d?|4\d?|5\d?|6\d?|7\d?|8\d?|9\d?)?|5(?:0\d?|1\d?|2\d?|3\d?|4\d?|5\d?|6\d?|7\d?|8\d?|9\d?)?|6(?:0\d?|1\d?|2\d?|3\d?|4\d?|5\d?|6\d?|7\d?|8\d?|9\d?)?|7(?:0\d?|1\d?|2\d?|3\d?|4\d?|5\d?|6\d?|7\d?|8\d?|9\d?)?|8(?:0\d?|1\d?|2\d?|3\d?|4\d?|5\d?|6\d?|7\d?|8\d?|9\d?)?|9(?:0\d?|1\d?|2\d?|3\d?|4\d?|5\d?|6\d?|7\d?|8\d?|9\d?)?)$/
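
Enumerating every integer works, but the pattern can also be built directly from the digits of the bound. The following is a rough Python sketch of that idea, not the method used by Regexp::Assemble. It assumes non-negative integers without leading zeros and a strict "less than" comparison (so it covers 0 .. bound-1 rather than 1 .. n as above). The function names digits and less_than_pattern are made up for this example.

```python
import re


def digits(count):
    """Regex fragment matching exactly `count` decimal digits."""
    if count == 0:
        return ""
    return r"\d" if count == 1 else r"\d{%d}" % count


def less_than_pattern(bound):
    """Build a regex for non-negative integers (no leading zeros) below `bound`."""
    if bound < 1:
        raise ValueError("no non-negative integer is below %d" % bound)
    s = str(bound)
    k = len(s)
    alternatives = []
    # Every number with fewer digits than the bound is smaller than it.
    if k >= 2:
        alternatives.append(r"\d")
    if k == 3:
        alternatives.append(r"[1-9]\d")
    elif k > 3:
        alternatives.append(r"[1-9]\d{1,%d}" % (k - 2))
    # Same length: copy a prefix of the bound, then use a strictly smaller
    # digit, then allow any digits in the remaining positions.
    for i, ch in enumerate(s):
        low = 1 if (i == 0 and k > 1) else 0  # no leading zero on multi-digit numbers
        high = int(ch) - 1
        if high < low:
            continue
        digit = str(low) if low == high else "[%d-%d]" % (low, high)
        alternatives.append(s[:i] + digit + digits(k - i - 1))
    return "(?:" + "|".join(alternatives) + ")"


print(less_than_pattern(255))  # (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-4])
```

The result is meant to be used with re.fullmatch (or wrapped in ^ and $). It has at most one alternative per digit of the bound, plus up to two for shorter numbers, so it stays short even for large bounds.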
tchrist answered Oct 16 '22 00:10


You're going to need to generate the expression for each bounding number. Suppose there were a single regular expression that did the job for every bound. Then it would have to take the bound itself as part of its input, as some sequence of characters. However, regular expressions and finite state automata (FSAs) are equivalent, so this is the same as saying we can construct a single FSA that handles every bound. Since the possible bound is unbounded, that would require an unbounded number of states, which contradicts the definition of an FSA.
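
To make this concrete, here is a rough Python sketch of the automaton for one fixed bound. It assumes decimal input without leading zeros and a bound of at least 1. All names (make_dfa, accepts, reachable_states) are invented for this illustration. Each state records how many digits have been read and how the input compares to the bound so far, so the machine for a k-digit bound needs up to 3k + 3 states. That number is finite for any one bound, but it grows as the bound gets longer.

```python
def make_dfa(bound):
    """Return (step, start, is_accepting) for decimal strings below `bound`."""
    s = str(bound)
    k = len(s)

    def compare(c, d):
        return "lt" if c < d else "gt" if c > d else "eq"

    def step(state, c):
        if state == "start":
            return "zero" if c == "0" else (1, compare(c, s[0]))
        if state in ("zero", "dead"):
            return "dead"  # "0" followed by more digits, or already rejected
        i, rel = state
        if i == k:
            return "dead"  # more digits than the bound, so too large
        # Once the input differs from the bound, the relation is settled.
        return (i + 1, rel if rel != "eq" else compare(c, s[i]))

    def is_accepting(state):
        if state == "zero":
            return True  # 0 is below every bound >= 1
        if state in ("start", "dead"):
            return False
        i, rel = state
        return i < k or rel == "lt"  # shorter, or same length and smaller

    return step, "start", is_accepting


def accepts(bound, text):
    """Run the automaton on a string of decimal digits."""
    step, state, is_accepting = make_dfa(bound)
    for c in text:
        state = step(state, c)
    return is_accepting(state)


def reachable_states(bound):
    """Collect every state the automaton can actually reach."""
    step, start, _ = make_dfa(bound)
    seen, todo = {start}, [start]
    while todo:
        state = todo.pop()
        for c in "0123456789":
            nxt = step(state, c)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen


for bound in (7, 255, 10**6 - 1):
    print(bound, len(reachable_states(bound)))
```

A single machine that read both the bound and the number from its input would have to remember arbitrarily many digits, which no fixed set of states can do.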

Charlie Martin answered Oct 16 '22 00:10