
How can I find out whether my String contains diacritics?

For example:

String text = "Československá obchodní banka";

The string text contains diacritics such as Č, á, etc.

I want to write a function to which I pass the string "Československá obchodní banka" and which returns true if the string contains diacritics, and false otherwise.

I have to handle strings with diacritics separately from strings that contain characters outside the A-Z or a-z range.

1) If the string contains diacritics, I have to do some operation XXXXXX on it.

2) If the string contains characters other than A-Z or a-z but no diacritics, I have to do some other operation YYYYY.

I have no idea how to do it.

Pramod Kumar asked Jul 03 '12 10:07



2 Answers

One piece of background: Unicode has a single code point for á, but the same result can also be produced by an a followed by a combining acute accent (U+0301).
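To make the two representations concrete, here is a small sketch. The class name ComposedVsDecomposed and the helper toNfd are made up for illustration; only java.text.Normalizer comes from the JDK. The strings use Unicode escapes so the source file encoding does not matter.

```java
import java.text.Normalizer;

final class ComposedVsDecomposed {
    // The letter a-acute as one precomposed code point (U+00E1).
    static final String PRECOMPOSED = "\u00E1";
    // Plain a followed by COMBINING ACUTE ACCENT (U+0301).
    static final String DECOMPOSED = "a\u0301";

    // Hypothetical helper: canonical decomposition (NFD) of the input.
    static String toNfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        // The two strings render the same but are not equal as Java strings.
        System.out.println(PRECOMPOSED.equals(DECOMPOSED));
        // After NFD normalization the precomposed form matches the decomposed one.
        System.out.println(toNfd(PRECOMPOSED).equals(DECOMPOSED));
    }
}
```

This is why comparing or inspecting the raw string is not enough, and why normalizing first gives a uniform form to test against.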

You can use java.text.Normalizer, as follows:

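import java.text.Normalizer;

public static boolean hasDiacritics(String s) {
    // Decompose precomposed characters, e.g. á becomes a + combining acute accent.
    String s2 = Normalizer.normalize(s, Normalizer.Form.NFD);
    // (?s) lets . match line breaks, so multi-line input is handled too.
    return s2.matches("(?s).*\\p{InCombiningDiacriticalMarks}.*");
    // Alternative: return !s2.equals(s);
}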
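For the two cases in the question, the same check could be combined with a plain A-Z/a-z test. The following is only a sketch under my own assumptions: the class TextClassifier, the enum Kind and its constant names are invented here, and "character other than A-Z or a-z" is read literally, so spaces and digits count as well.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

final class TextClassifier {
    // Hypothetical result type for the cases described in the question.
    enum Kind { HAS_DIACRITICS, HAS_OTHER_NON_LETTERS, PLAIN_ASCII_LETTERS }

    // Matches any code point in the Combining Diacritical Marks block (U+0300 to U+036F).
    private static final Pattern COMBINING_MARK =
            Pattern.compile("\\p{InCombiningDiacriticalMarks}");
    // Matches any character that is not an ASCII letter.
    private static final Pattern NOT_ASCII_LETTER = Pattern.compile("[^A-Za-z]");

    static Kind classify(String s) {
        // Decompose first so precomposed letters expose their combining marks.
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        if (COMBINING_MARK.matcher(decomposed).find()) {
            return Kind.HAS_DIACRITICS;          // case 1: do XXXXXX
        }
        if (NOT_ASCII_LETTER.matcher(s).find()) {
            return Kind.HAS_OTHER_NON_LETTERS;   // case 2: do YYYYY
        }
        return Kind.PLAIN_ASCII_LETTERS;
    }
}
```

Note that letters such as ø have no canonical decomposition, so with this approach they land in case 2 rather than case 1. Whether that is acceptable depends on what XXXXXX and YYYYY are supposed to do.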
Joop Eggen answered Nov 18 '22 16:11



The Normalizer class seems able to accomplish this. Some limited testing indicates that

Normalizer.isNormalized(text, Normalizer.Form.NFD)

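might be what you need: it returns false when text contains precomposed accented characters such as á, so negate the result to detect them.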
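A rough sketch of how this could be wrapped, with one caveat worth checking against your data. The class IsNormalizedCheck and the method changesUnderNfd are hypothetical names; Normalizer.isNormalized is the real JDK call.

```java
import java.text.Normalizer;

final class IsNormalizedCheck {
    // Hypothetical helper: true when the text is NOT already in NFD,
    // i.e. NFD normalization would change it.
    static boolean changesUnderNfd(String text) {
        return !Normalizer.isNormalized(text, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        System.out.println(changesUnderNfd("\u00E1"));   // precomposed a-acute
        System.out.println(changesUnderNfd("abc"));      // plain ASCII
        System.out.println(changesUnderNfd("a\u0301"));  // a + combining acute accent
    }
}
```

The caveat: a string that is already decomposed (a followed by U+0301) is in NFD, so this check reports no change even though a diacritic is present. If the input may arrive in decomposed form, looking for combining marks after normalizing is the safer test.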

Keppil answered Nov 18 '22 14:11
