Ignoring diacritic characters when comparing words with special characters (é, è, ...)

Question

I have a list of Belgian cities whose names contain diacritics (Liège, Quiévrain, Franière, etc.), and I would like to strip these special characters so I can compare the names with a list containing the same names in upper case but without the diacritical marks (LIEGE, QUIEVRAIN, FRANIERE).

What I first tried was converting to upper case:

"LIEGE".contentEquals("Liège".toUpperCase()), but that doesn't work because the upper case of Liège is LIÈGE, not LIEGE.

I have some complicated ideas, like replacing each character individually, but that sounds clumsy and tedious.

Any ideas on how to do this in a smart way?

Stijn Van Bael · Accepted Answer

As of Java 6, you can use java.text.Normalizer:

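import java.text.Normalizer;

// Decompose accented characters (NFD), then drop everything that is not ASCII.
public String unaccent(String s) {
    String normalized = Normalizer.normalize(s, Normalizer.Form.NFD);
    // The backslash must be doubled inside a Java string literal.
    return normalized.replaceAll("[^\\p{ASCII}]", "");
}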

Note that in Java 5 there is also a sun.text.Normalizer, but its use is strongly discouraged since it's part of Sun's proprietary API and has been removed in Java 6.
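
To tie the unaccent method above back to the comparison in the question, here is a minimal sketch of how it might be used. The class name DiacriticsDemo and the helper matchesIgnoringDiacritics are hypothetical names chosen for illustration, and the city names are written with Unicode escapes only so that the example does not depend on the source file encoding.

```java
import java.text.Normalizer;
import java.util.Locale;

public class DiacriticsDemo {

    // Same approach as the answer: decompose (NFD), then drop non-ASCII characters.
    static String unaccent(String s) {
        String normalized = Normalizer.normalize(s, Normalizer.Form.NFD);
        return normalized.replaceAll("[^\\p{ASCII}]", "");
    }

    // Hypothetical helper: strips the accents, upper-cases the result,
    // and compares it with a name that is already plain upper case.
    static boolean matchesIgnoringDiacritics(String accented, String plainUpper) {
        return unaccent(accented).toUpperCase(Locale.ROOT).equals(plainUpper);
    }

    public static void main(String[] args) {
        // The three accented city names from the question, as Unicode escapes.
        String[] cities = { "Li\u00e8ge", "Qui\u00e9vrain", "Frani\u00e8re" };
        String[] plain = { "LIEGE", "QUIEVRAIN", "FRANIERE" };
        for (int i = 0; i < cities.length; i++) {
            String stripped = unaccent(cities[i]).toUpperCase(Locale.ROOT);
            System.out.println(stripped + " matches " + plain[i] + ": "
                    + matchesIgnoringDiacritics(cities[i], plain[i]));
        }
    }
}
```

One caveat with this approach: characters that have no ASCII base letter after decomposition (for example ß or ø) are removed entirely rather than replaced, so it suits names like the ones in the question better than arbitrary text.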

Tags: java, string, android, replace, diacritics

Asked by Waza_Be
