I have a list of Belgian city names containing diacritics (Liège, Quiévrain, Franière, etc.), and I would like to strip these special characters so I can compare the names with a list containing the same names in upper case but without the diacritical marks (LIEGE, QUIEVRAIN, FRANIERE).
What I first tried was converting to upper case:
"LIEGE".contentEquals("Liège".toUpperCase())
but that doesn't work, because the upper case of Liège is LIÈGE and not LIEGE.
I have some complicated ideas, like replacing each character individually, but that sounds clumsy and tedious.
Any ideas on how to do this in a smart way?
As of Java 6, you can use java.text.Normalizer:
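import java.text.Normalizer;

public String unaccent(String s) {
    // NFD splits each accented letter into a base letter plus combining marks
    String normalized = Normalizer.normalize(s, Normalizer.Form.NFD);
    // Drop every non-ASCII character, which removes the combining marks
    return normalized.replaceAll("[^\\p{ASCII}]", "");
}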
Note that in Java 5 there is also a sun.text.Normalizer, but its use is strongly discouraged since it's part of Sun's proprietary API and has been removed in Java 6.
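To tie this back to the comparison in the question, here is a minimal sketch that strips the accents and then upper-cases the result before comparing. The class name CityNames and the helper sameCity are hypothetical names chosen for illustration, and Locale.ROOT is an assumption made so that the upper-casing does not depend on the machine's default locale.

```java
import java.text.Normalizer;
import java.util.Locale;

public class CityNames {

    // Decompose accented letters (NFD), then drop every non-ASCII character,
    // which removes the combining marks left behind by the decomposition.
    static String unaccent(String s) {
        String normalized = Normalizer.normalize(s, Normalizer.Form.NFD);
        return normalized.replaceAll("[^\\p{ASCII}]", "");
    }

    // Hypothetical helper: true when the accented name, once unaccented and
    // upper-cased, equals the plain upper-case name from the second list.
    static boolean sameCity(String accented, String plainUpper) {
        return unaccent(accented).toUpperCase(Locale.ROOT).equals(plainUpper);
    }

    public static void main(String[] args) {
        System.out.println(unaccent("Liège"));                   // Liege
        System.out.println(sameCity("Quiévrain", "QUIEVRAIN"));  // true
        System.out.println(sameCity("Franière", "FRANIERE"));    // true
    }
}
```

Because the regular expression removes every non-ASCII character, letters that have no decomposition (for example ø or ß) would be dropped entirely rather than replaced, so this approach suits names like the ones in the question.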