Sorry if this is a dumb question, but it's been driving me mental for the past 5 days.
I'm trying to make a regex pattern to match Irish car registrations, for example '12-W-1234'.
So far this is what I have:
    import java.util.ArrayList;
    import java.util.List;

    public class ValidateDemo {
        public static void main(String[] args) {
            List<String> input = new ArrayList<String>();
            input.add("12-WW-1");
            input.add("12-W-223");
            input.add("02-WX-431");
            input.add("98-zd-4134");
            input.add("99-c-7465");
            for (String car : input) {
                if (car.matches("^(\\d{2}-?\\w*([KK|kk|ww|WW|c|C|ce|CE|cn|CN|cw|CW|d|D|dl|DL|g|G|ke|KE|ky|KY|l|L|ld|LD|lh|LH|lk|LK|lm|LM|ls|LS|mh|MH|mn|MN|mo|MO|oy|OY|so|SO|rn|RN|tn|TN|ts|TS|w|W|wd|WD|wh|WH|wx|WX])-?\\d{1,4})$")) {
                    System.out.println("Car Template " + car);
                }
            }
        }
    }
My problems come up when it checks regs whose county code contains a single letter that is in my pattern, e.g. '12-ZD-1234'. ZD isn't a valid county ID, but since D is valid, the reg is accepted and displayed.
Any help would be great.
I've already done research on a few websites including this and this.
These websites helped, but I'm still having my problems.
By the by, I'm going to change the code to convert all inputs to uppercase so the pattern can be shorter. Thanks for the help.
Besides the \\w* that others have pointed out, you're misusing character classes ([...]). To actually use alternation (|), take out the square brackets as well:
^(\\d{2}-?(KK|kk|ww|WW|c|C|ce|CE|cn|CN|cw|CW|d|D|dl|DL|g|G|ke|KE|ky|KY|l|L|ld|LD|lh|LH|lk|LK|lm|LM|ls|LS|mh|MH|mn|MN|mo|MO|oy|OY|so|SO|rn|RN|tn|TN|ts|TS|w|W|wd|WD|wh|WH|wx|WX)-?\\d{1,4})$
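As a quick check, here is a minimal sketch that runs the corrected pattern above against the inputs from the question plus the problem case '12-ZD-1234'. The class name RegCheck and the helper isValid are made up for this illustration; the pattern string is the one shown above, unchanged.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the pattern string comes from the answer above.
class RegCheck {
    // Plain group with alternation: no \w* and no character class.
    private static final Pattern REG = Pattern.compile(
        "^(\\d{2}-?(KK|kk|ww|WW|c|C|ce|CE|cn|CN|cw|CW|d|D|dl|DL|g|G|ke|KE|ky|KY"
        + "|l|L|ld|LD|lh|LH|lk|LK|lm|LM|ls|LS|mh|MH|mn|MN|mo|MO|oy|OY|so|SO"
        + "|rn|RN|tn|TN|ts|TS|w|W|wd|WD|wh|WH|wx|WX)-?\\d{1,4})$");

    // True only if the whole string is a registration the pattern accepts.
    static boolean isValid(String reg) {
        return REG.matcher(reg).matches();
    }

    public static void main(String[] args) {
        String[] input = {
            "12-WW-1",     // true
            "12-W-223",    // true
            "02-WX-431",   // true
            "98-zd-4134",  // false: zd is not in the alternation
            "99-c-7465",   // true
            "12-ZD-1234"   // false: ZD is no longer accepted just because D is valid
        };
        for (String car : input) {
            System.out.println(car + " -> " + isValid(car));
        }
    }
}
```

The regex engine tries the alternatives from left to right and backtracks when what follows does not fit, so 'c' appearing before 'ce' in the list does not stop '12-ce-1' from matching.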
Here are some examples to show you how character classes actually work:
- [abc] matches a single character, either a, b, or c.
- [aabbcc] is equivalent to [abc] (duplicates are disregarded).
- [|] matches a pipe character, i.e. symbols are allowed.
- [KK|kk|ww|WW|c|C|ce|CE ... ] ends up being equivalent to [Kk|wWcCeE ... ] because, again, duplicates are disregarded.

You were correct to use the alternation operator (|) to do what you wanted, but you didn't need to use character classes.
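The character-class behaviour described above can be seen directly with String.matches. This is only an illustrative sketch: the class name CharClassDemo, the helper whole, and the shortened county list are assumptions made for brevity, not part of the original code.

```java
// Hypothetical demo of how character classes differ from alternation.
class CharClassDemo {
    // True if the entire input matches the regex (String.matches semantics).
    static boolean whole(String regex, String input) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(whole("[abc]", "a"));     // true
        System.out.println(whole("[abc]", "ab"));    // false: a class matches exactly one character
        System.out.println(whole("[aabbcc]", "b"));  // true: duplicates change nothing
        System.out.println(whole("[|]", "|"));       // true: the pipe is a literal inside a class

        // Shortened county list, written as a class (wrong) and as a group (right).
        String asClass = "\\w*[KK|kk|d|D|dl|DL]";
        String asGroup = "(KK|kk|d|D|dl|DL)";
        System.out.println(whole(asClass, "ZD"));    // true: \w* consumes Z, the class matches D
        System.out.println(whole(asGroup, "ZD"));    // false: ZD is not one of the alternatives
        System.out.println(whole(asGroup, "DL"));    // true
    }
}
```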
You can improve your pattern like this:
^[0-9]{2}-?(?>c[enw]?|C[ENW]?|dl?|DL?|g|G|k[eky]|K[EKY]|l[dhkms]?|L[DHKMS]?|m[hno]|M[HNO]|oy|OY|rn|RN|so|SO|t[ns]|T[NS]|w[dhwx]?|W[DHWX]?)-?[0-9]{1,4}$
And if you don't care about the case of letters:
^(?i)[0-9]{2}-?(?>c[enw]?|dl?|g|k[eky]|l[dhkms]?|m[hno]|oy|rn|so|t[ns]|w[dhwx]?)-?[0-9]{1,4}$
Note that anchors (^ and $) are useful if your string must contain only the car registration number.
Note 2: You can improve it further by putting the most frequent counties first in the alternation.
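For use from Java, the case-insensitive variant could be compiled once and reused, roughly as sketched below. Assumptions in this sketch: the class name RegCheckIgnoreCase and the helper isValid are hypothetical, Pattern.CASE_INSENSITIVE stands in for the inline (?i) flag, the alternation separates m[hno] and oy with | and includes ww so that every county code listed in the question is covered, and the anchors are left out because Matcher.matches() already requires the whole string to match.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: precompiled, case-insensitive registration check.
class RegCheckIgnoreCase {
    private static final Pattern REG = Pattern.compile(
        "[0-9]{2}-?(?>c[enw]?|dl?|g|k[eky]|l[dhkms]?|m[hno]|oy|rn|so|t[ns]|w[dhwx]?)-?[0-9]{1,4}",
        Pattern.CASE_INSENSITIVE);

    // matches() only succeeds when the entire input fits the pattern.
    static boolean isValid(String reg) {
        return REG.matcher(reg).matches();
    }

    public static void main(String[] args) {
        String[] input = {
            "12-WW-1",     // true
            "12-w-223",    // true: case is ignored
            "98-zd-4134",  // false: zd is not a county code
            "07-OY-12",    // true
            "12-K-5"       // false: k must be followed by e, k or y
        };
        for (String car : input) {
            System.out.println(car + " -> " + isValid(car));
        }
    }
}
```

Compiling the pattern once avoids the recompilation that String.matches performs on every call, which matters if many registrations are validated in a loop.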