I have the following regular expression, which works fine when the user's input is in English but always fails on Portuguese characters.
Pattern p = Pattern.compile("^[a-zA-Z]*$");
Matcher matcher = p.matcher(fieldName);
if (!matcher.matches())
{
....
}
Is there any way to get the pattern object to recognise valid Portuguese characters such as ÁÂÃÀÇÉÊÍÓÔÕÚç....?
Thanks
You want a regular expression that will match the class of all alphabetic letters. Across all the scripts of the world, there are loads of those, but luckily we can tell Java 6's RE engine that we're after a letter and it will use the magic of Unicode classes to do the rest. In particular, the L class (written \p{L} in a pattern) matches all types of letters: upper, lower and “oh, that concept doesn't apply in my language”:
Pattern p = Pattern.compile("^\\p{L}*$");
// the rest is identical, so won't repeat it...
When reading the docs, remember that backslashes need to be doubled when placed in a Java string literal, so that the Java compiler doesn't interpret them as escape sequences of its own. (Also be aware that this RE is not suitable for things like validating the names of people, which is an entirely different and much more difficult problem.)
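For anyone who wants to try it, here is a minimal self-contained sketch of the \p{L} pattern in use. The class name LetterCheck and the method isLettersOnly are made up for illustration and are not from the question. The sample inputs are written with Unicode escapes so the file compiles whatever the source encoding is: the first one spells "Conceição" and the second "João Silva".

```java
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class LetterCheck {
    // \p{L} matches any Unicode letter; the backslash is doubled for the Java literal.
    private static final Pattern LETTERS_ONLY = Pattern.compile("^\\p{L}*$");

    // Returns true when every character of the input is a Unicode letter.
    // The empty string also passes, because the pattern uses '*'.
    static boolean isLettersOnly(String input) {
        return LETTERS_ONLY.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isLettersOnly("Concei\u00e7\u00e3o"));  // true: c-cedilla and a-tilde are letters
        System.out.println(isLettersOnly("Jo\u00e3o Silva"));       // false: the space is not a letter
        System.out.println(isLettersOnly("abc123"));                // false: digits are not letters
    }
}
```

One thing to watch for: an accented letter stored in decomposed form (a base letter followed by a combining mark, e.g. "a" plus U+0303) will not match, because combining marks belong to the M category rather than L. If your input might arrive that way, normalising it to NFC with java.text.Normalizer before matching is probably the simplest fix.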