I'd like to use Hibernate Validator to validate some columns. The problem, as I understand, is that the \w marker in java doesn't accept letters with accents on them.
Is there any way that I could write the regexp so that words like Relatório could be validated (i wouldn't want to write all letters with accents between brackets, because I expect to be writing this regexp in a lot of columns)?
Therefore, the regular expression \s matches a single whitespace character, while \s+ will match one or more whitespace characters.
replaceAll("[^\\p{ASCII}]", ""); If your text is in unicode, you should use this instead: string = string. replaceAll("\\p{M}", "");
Backslashes in Java. The backslash \ is an escape character in Java Strings. That means backslash has a predefined meaning in Java. You have to use double backslash \\ to define a single backslash. If you want to define \w , then you must be using \\w in your regex.
The Java regex documentation has a section on Unicode categories (search for "Classes for Unicode blocks and categories"). If you're just looking for letters, I think \p{L}
is the category you want.
I had more luck with:
\p{InCombiningDiacriticalMarks}+
In java I use the following method:
import java.text.Normalizer;
import java.text.Normalizer.Form;
public static String removeAccents(String text) {
return text == null ? null :
Normalizer.normalize(text, Form.NFD)
.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With