Help with drivers license number validation regex

Tags: c#, regex

I'm trying to validate a driver's license number for a form that I am making, and I was hoping to do it with a single regex. The rules are:

  1. Max length 9 characters
  2. Alphanumeric characters only
  3. Must have at least 4 numeric characters
  4. Must have no more than 2 alphabetic characters
  5. The third and fourth character must be numeric

I'm new to regex and have been googling to try to work it out. Help with this would be appreciated.

North asked Jun 20 '09 10:06


2 Answers

Does it have to be a single regex? I'd keep things simple by keeping them separate:

using System.Text.RegularExpressions;

static bool IsValid(string input)
{
    return Regex.IsMatch(input, @"^[A-Za-z0-9]{4,9}$") // 4 to 9 characters, alphanumeric only
        && Regex.IsMatch(input, "^..[0-9]{2}") // 3rd and 4th characters are numeric
        && Regex.IsMatch(input, "(.*[0-9]){4}") // at least 4 numeric characters
        && !Regex.IsMatch(input, "(.*[A-Za-z]){3}"); // no more than 2 alphabetic characters
}

Marc Gravell answered Oct 19 '22 09:10


Trying to solve this with just one regex is probably a little hard as you need to keep track of multiple things. I'd suggest you try validating each of the properties separately (unless it makes sense to do otherwise).

For example, you can verify the first and second properties easily with a character class containing all alphanumeric characters and a quantifier that says it must occur between 4 and 9 times (the lower bound of 4 follows from the third property):

^[0-9a-zA-Z]{4,9}$

The anchors ^ and $ ensure that this matches the entire string and not just a part of it. As Marc Gravell pointed out, the string "aaaaa" will match the expression "a{3}", because an unanchored pattern can match a substring.
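
To make the effect of the anchors concrete, here is a minimal sketch comparing the same character class with and without them. The class and method names (AnchorDemo, MatchesPart, MatchesWhole) are made up for illustration and are not part of either answer.

```csharp
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Unanchored: succeeds if any run of 4-9 alphanumeric characters
    // occurs somewhere inside the input.
    public static bool MatchesPart(string input) =>
        Regex.IsMatch(input, "[0-9a-zA-Z]{4,9}");

    // Anchored: the whole input must consist of 4-9 alphanumeric characters.
    // Note that in .NET, $ also accepts a single trailing newline.
    public static bool MatchesWhole(string input) =>
        Regex.IsMatch(input, "^[0-9a-zA-Z]{4,9}$");
}
```

With these helpers, a 12-character input such as "AB1234567890" passes MatchesPart but fails MatchesWhole, which is why the length rule only works with the anchors in place.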

Checking the fifth property can be done similarly, although this time we don't care about the rest:

^..[0-9]{2}

Here the ^ character is an anchor for the start of the string, the dot (.) is a placeholder for an arbitrary character and we're checking for the third and fourth character being numeric again with a character class and a repetition quantifier.
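
As a rough illustration, the same check can be written either with that pattern or with plain indexing. This is only a sketch: PositionCheck, ByRegex and ByIndex are hypothetical names, and the two forms are assumed to agree only for inputs without newline characters (the dot does not match a newline by default in .NET).

```csharp
using System.Text.RegularExpressions;

public static class PositionCheck
{
    // Regex form: skip any two characters, then require two ASCII digits.
    public static bool ByRegex(string input) =>
        Regex.IsMatch(input, "^..[0-9]{2}");

    // Index-based form: look directly at the third and fourth characters.
    public static bool ByIndex(string input) =>
        input.Length >= 4 && IsAsciiDigit(input[2]) && IsAsciiDigit(input[3]);

    private static bool IsAsciiDigit(char c) => c >= '0' && c <= '9';
}
```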

Properties three and four are probably most easily validated by iterating through the string and keeping counters as you go along.
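
A minimal sketch of that counting approach might look like the following. The names CharacterCounts and HasValidCounts are invented for this example, and it assumes "numeric" and "alphabetic" mean the ASCII ranges 0-9 and A-Z/a-z, as in the patterns above.

```csharp
public static class CharacterCounts
{
    // Checks properties 3 and 4: at least 4 digits and no more than 2 letters.
    public static bool HasValidCounts(string input)
    {
        int digits = 0;
        int letters = 0;

        foreach (char c in input)
        {
            if (c >= '0' && c <= '9')
            {
                digits++;
            }
            else if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'))
            {
                letters++;
            }
        }

        return digits >= 4 && letters <= 2;
    }
}
```

This only covers the two counting rules; length, character set and the position of the digits would still be checked separately, for example with the two patterns above.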

EDIT: Marc Gravell has a very nice solution for those two cases with regular expressions as well. Didn't think of those.

If you absolutely need to do this in one regular expression, it will take a bit of work (and will probably be neither faster nor more readable). Basically, I'd start by enumerating all the forms such a string could take. I am using a here as a placeholder for an alphabetic character and 1 as a placeholder for a digit.

We need at least four characters (property 3), and the third and fourth are always fixed as numbers (property 5). For four-character strings this leaves us with only one option:

1111

Five-character strings may introduce a letter, though, with different placements:

a1111
1a111
1111a

and, as before, the all-numeric variant:

11111

Going on like this, you can create a rule for each case. Basically, I'd divide this into "no letter", "one letter" and "two letters" and enumerate the different patterns for each. You can then string all the patterns together with the pipe (|) character, which denotes alternation in regular expressions.
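
Writing all of those patterns out by hand is tedious, so here is a sketch that generates the alternation instead. It is one possible reading of the enumeration described above, not code from the answer. SinglePatternBuilder, Build, Layout and IsValid are hypothetical names, and letters and digits are assumed to be the ASCII classes [A-Za-z] and [0-9].

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SinglePatternBuilder
{
    private const string Letter = "[A-Za-z]";
    private const string Digit = "[0-9]";

    // Builds one anchored alternation covering every allowed layout.
    public static string Build()
    {
        var alternatives = new List<string>();

        for (int length = 4; length <= 9; length++)
        {
            // Letters may sit anywhere except the third and fourth
            // positions (indexes 2 and 3).
            int[] free = Enumerable.Range(0, length)
                .Where(i => i != 2 && i != 3)
                .ToArray();

            // "No letter" case.
            alternatives.Add(Layout(length));

            // "One letter" case: needs length - 1 >= 4 digits.
            if (length >= 5)
            {
                foreach (int index in free)
                {
                    alternatives.Add(Layout(length, index));
                }
            }

            // "Two letters" case: needs length - 2 >= 4 digits.
            if (length >= 6)
            {
                for (int i = 0; i < free.Length; i++)
                {
                    for (int j = i + 1; j < free.Length; j++)
                    {
                        alternatives.Add(Layout(length, free[i], free[j]));
                    }
                }
            }
        }

        return "^(?:" + string.Join("|", alternatives) + ")$";
    }

    // Renders one layout: a letter class at the given indexes, digits elsewhere.
    private static string Layout(int length, params int[] letterIndexes)
    {
        return string.Concat(Enumerable.Range(0, length)
            .Select(i => letterIndexes.Contains(i) ? Letter : Digit));
    }

    // Convenience wrapper that validates an input against the generated pattern.
    public static bool IsValid(string input) => Regex.IsMatch(input, Build());
}
```

For lengths 4 through 9 this produces 83 alternatives, which gives a sense of why the version with several small checks is easier to read and maintain.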

Joey answered Oct 19 '22 08:10