
Regex to match exactly n occurrences of letters and m occurrences of digits

Tags: java, regex

I have to match an 8-character string that contains exactly 2 letters (1 uppercase and 1 lowercase) and exactly 6 digits, in any order.

So, basically:

  • K82v6686 would pass
  • 3w28E020 would pass
  • 1276eQ900 would fail (too long)
  • 98Y78k9k would fail (three letters)
  • A09B2197 would fail (two capital letters)

I've tried using the positive lookahead to make sure that the string contains digits, uppercase and lowercase letters, but I have trouble with limiting it to a certain number of occurrences. I suppose I could go about it by including all possible combinations of where the letters and digits can occur:

(?=.*[0-9])(?=.*[A-Z])(?=.*[a-z]) ([A-Z][a-z][0-9]{6})|([A-Z][0-9][a-z][0-9]{5})| ... | ([0-9]{6}[a-z][A-Z])

But that's a very roundabout way of doing it, and I'm wondering if there's a better solution.

NoelAramis asked Nov 18 '15 10:11




2 Answers

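Sort the string's characters and then match against ^[0-9]{6}[A-Z][a-z]$. In ASCII order, digits sort before uppercase letters, which sort before lowercase letters, so a valid string always sorts into six digits followed by the uppercase and then the lowercase letter.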
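A minimal sketch of the sort-then-match idea, assuming the characters are sorted with Arrays.sort on a char[] (ascending char value, so digits land first, then the uppercase letter, then the lowercase letter). The class name SortedCodeCheck and the method isValid are made up for illustration.

```java
import java.util.Arrays;

final class SortedCodeCheck {
    // Shape of a valid string after sorting by char value:
    // six digits, then one uppercase letter, then one lowercase letter.
    private static final String SORTED_SHAPE = "[0-9]{6}[A-Z][a-z]";

    static boolean isValid(String s) {
        char[] chars = s.toCharArray();
        Arrays.sort(chars); // ascending order: '0'-'9' < 'A'-'Z' < 'a'-'z'
        // matches() must consume the whole string, so no ^ or $ is needed.
        return new String(chars).matches(SORTED_SHAPE);
    }

    public static void main(String[] args) {
        String[] samples = {"K82v6686", "3w28E020", "1276eQ900", "98Y78k9k", "A09B2197"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Sorting makes the regex trivial at the cost of an extra copy of the input, which is negligible for 8-character strings.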

hakamadare answered Nov 02 '22 17:11


You can use

^(?=[^A-Z]*[A-Z][^A-Z]*$)(?=[^a-z]*[a-z][^a-z]*$)(?=(?:\D*\d){6}\D*$)[a-zA-Z0-9]{8}$

See the regex demo (a bit modified due to the multiline input). In a Java string literal, remember to double the backslashes (e.g. \\d to match a digit).

Here is a breakdown:

  • ^ - start of string (assuming no multiline flag is to be used)
  • (?=[^A-Z]*[A-Z][^A-Z]*$) - checks that there is exactly 1 uppercase letter (use \p{Lu} to match any Unicode uppercase letter and \P{Lu} to match any character other than that)
  • (?=[^a-z]*[a-z][^a-z]*$) - a similar check that there is exactly 1 lowercase letter (alternatively, use \p{Ll} and \P{Ll} to match Unicode letters)
  • (?=(?:\D*\d){6}\D*$) - checks that there are exactly six digits in the string: starting from the beginning, 0 or more non-digit characters (\D matches any character but a digit; you may also replace it with [^0-9]) followed by a digit (\d) must occur six times, and then only non-digit characters (\D*) may follow up to the end of string ($)
  • [a-zA-Z0-9]{8} - match exactly 8 alphanumeric characters.
  • $ - end of string.
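The breakdown above can be tried against the five sample strings from the question. This is a small sketch, not a definitive implementation; the class name LookaheadCheck and the method isValid are hypothetical.

```java
import java.util.regex.Pattern;

final class LookaheadCheck {
    private static final Pattern CODE = Pattern.compile(
          "^(?=[^A-Z]*[A-Z][^A-Z]*$)"   // exactly one uppercase letter
        + "(?=[^a-z]*[a-z][^a-z]*$)"    // exactly one lowercase letter
        + "(?=(?:\\D*\\d){6}\\D*$)"     // exactly six digits
        + "[a-zA-Z0-9]{8}$");           // eight alphanumeric characters in total

    static boolean isValid(String s) {
        return CODE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"K82v6686", "3w28E020", "1276eQ900", "98Y78k9k", "A09B2197"};
        for (String s : samples) {
            System.out.println(s + " -> " + (isValid(s) ? "pass" : "fail"));
        }
    }
}
```

Compiling the pattern once into a static Pattern avoids recompiling it on every call, which String.matches() would do.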

Following the logic, we can even reduce this to just

^(?=[^a-z]*[a-z][^a-z]*$)(?=(?:\D*\d){6}\D*$)[a-zA-Z0-9]{8}$

One condition can be removed because [a-zA-Z0-9]{8} only allows lowercase letters, uppercase letters and digits: once 6 of the 8 characters must be digits and exactly 1 must be a lowercase letter, the remaining character has to be an uppercase letter, so the third check is implied.
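As a quick sanity check of that reasoning, the sketch below compares the three-lookahead pattern with the reduced one on a handful of inputs. The inputs beyond the five from the question ("12345678", "ab123456", "AB123456", "a1234567") are made-up edge cases, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class ReducedPatternCheck {
    static final Pattern FULL = Pattern.compile(
          "^(?=[^A-Z]*[A-Z][^A-Z]*$)"   // exactly one uppercase letter
        + "(?=[^a-z]*[a-z][^a-z]*$)"    // exactly one lowercase letter
        + "(?=(?:\\D*\\d){6}\\D*$)"     // exactly six digits
        + "[a-zA-Z0-9]{8}$");

    static final Pattern REDUCED = Pattern.compile(
          "^(?=[^a-z]*[a-z][^a-z]*$)"   // exactly one lowercase letter
        + "(?=(?:\\D*\\d){6}\\D*$)"     // exactly six digits
        + "[a-zA-Z0-9]{8}$");           // 8 - 6 - 1 leaves one slot, which must be uppercase

    // True when both patterns give the same verdict for s.
    static boolean agree(String s) {
        return FULL.matcher(s).matches() == REDUCED.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "K82v6686", "3w28E020", "1276eQ900", "98Y78k9k", "A09B2197",
            "12345678", "ab123456", "AB123456", "a1234567"
        };
        for (String s : samples) {
            System.out.println(s + " -> reduced=" + REDUCED.matcher(s).matches()
                    + ", agrees with full=" + agree(s));
        }
    }
}
```

Agreement on a few samples is not a proof, but the counting argument (8 characters, 6 digits, 1 lowercase letter, so 1 uppercase letter) covers the general case.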

When using it with Java's matches() method, there is no need for the ^ and $ anchors at the start and end of the pattern, since matches() requires the whole string to match, but you still need $ inside the lookaheads:

String s = "K82v6686";
String rx = "(?=[^a-z]*[a-z][^a-z]*$)" +      // 1 lowercase letter check
            "(?=(?:\\D*\\d){6}\\D*$)" +       // 6 digits check
            "[a-zA-Z0-9]{8}";                 // matching 8 alphanum chars exactly
if (s.matches(rx)) {
    System.out.println("Valid"); 
} 
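To illustrate why the $ inside the lookaheads still matters with matches(), here is a small sketch that compares the pattern with and without it. The class name AnchorDemo, the constant names, and the inputs "ab123456" and "1234567a" are made up for this example.

```java
final class AnchorDemo {
    // Lookaheads anchored with $: they must account for the whole string.
    static final String ANCHORED =
          "(?=[^a-z]*[a-z][^a-z]*$)"
        + "(?=(?:\\D*\\d){6}\\D*$)"
        + "[a-zA-Z0-9]{8}";

    // Same pattern without $ in the lookaheads: each lookahead is satisfied
    // by a prefix, so it only proves "at least one lowercase letter" and
    // "at least six digits".
    static final String UNANCHORED =
          "(?=[^a-z]*[a-z][^a-z]*)"
        + "(?=(?:\\D*\\d){6}\\D*)"
        + "[a-zA-Z0-9]{8}";

    public static void main(String[] args) {
        for (String s : new String[] {"K82v6686", "ab123456", "1234567a"}) {
            System.out.println(s + ": anchored=" + s.matches(ANCHORED)
                    + ", unanchored=" + s.matches(UNANCHORED));
        }
    }
}
```

Without the $, "ab123456" (two lowercase letters) and "1234567a" (seven digits) are accepted, because matches() only anchors the overall match, not what each lookahead inspects.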
Wiktor Stribiżew answered Nov 02 '22 17:11