For example:
text = "Československá obchodní banka";
The string contains diacritics such as Č and á.
I want to write a function that takes this string, "Československá obchodní banka", and returns true if the string contains diacritics and false otherwise.
I have to handle strings with diacritics separately from strings that contain characters outside the A-Z or a-z range.
1) If the string contains diacritics, I have to do some operation XXXXXX on it.
2) If the string contains characters other than A-Z or a-z but no diacritics, I have to do some other operation YYYYY.
I have no idea how to do it.
One piece of background: Unicode has a single code point for á (U+00E1), but the same character can also be written as a plain a followed by a combining acute accent (U+0301).
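To make the two representations concrete, here is a minimal sketch (the class and constant names are only illustrative) showing that the precomposed and decomposed forms of á are different strings as far as Java is concerned, even though they render the same:

```java
public class ComposedVsDecomposed {
    // Precomposed form: the single code point U+00E1 (LATIN SMALL LETTER A WITH ACUTE).
    static final String PRECOMPOSED = "\u00E1";
    // Decomposed form: 'a' followed by U+0301 (COMBINING ACUTE ACCENT).
    static final String DECOMPOSED = "a\u0301";

    public static void main(String[] args) {
        System.out.println(PRECOMPOSED.length());           // 1
        System.out.println(DECOMPOSED.length());            // 2
        System.out.println(PRECOMPOSED.equals(DECOMPOSED)); // false
    }
}
```

This is why a plain character comparison is not enough and some form of normalization is needed first.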
You can use java.text.Normalizer, as follows:
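import java.text.Normalizer;

public static boolean hasDiacritics(String s) {
    // NFD decomposes a precomposed character such as á into a plus a combining acute accent.
    String s2 = Normalizer.normalize(s, Normalizer.Form.NFD);
    // (?s) lets . match line terminators, so multi-line input is handled too.
    return s2.matches("(?s).*\\p{InCombiningDiacriticalMarks}.*");
    // Alternative: return !s2.equals(s); (true whenever NFD changed the string)
}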
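Building on that, the two cases from the question could be separated roughly as follows. This is only a sketch: the TextClassifier class, the Kind enum, and the classify method are names made up for illustration, and it assumes that spaces and punctuation count as "characters other than A-Z or a-z". The string literals use Unicode escapes so the file compiles regardless of source encoding.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

public class TextClassifier {
    // Hypothetical result type covering the question's two cases plus the plain case.
    enum Kind { PLAIN_ASCII_LETTERS, HAS_DIACRITICS, OTHER_NON_LETTERS }

    private static final Pattern COMBINING_MARK =
            Pattern.compile("\\p{InCombiningDiacriticalMarks}");
    private static final Pattern ONLY_ASCII_LETTERS = Pattern.compile("[A-Za-z]*");

    static boolean hasDiacritics(String s) {
        // Decompose first so precomposed letters expose their combining marks.
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return COMBINING_MARK.matcher(decomposed).find();
    }

    static Kind classify(String s) {
        if (hasDiacritics(s)) {
            return Kind.HAS_DIACRITICS;      // case 1: do XXXXXX
        }
        if (!ONLY_ASCII_LETTERS.matcher(s).matches()) {
            return Kind.OTHER_NON_LETTERS;   // case 2: do YYYYY
        }
        return Kind.PLAIN_ASCII_LETTERS;
    }

    public static void main(String[] args) {
        // "Československá obchodní banka" written with escapes.
        System.out.println(classify("\u010Ceskoslovensk\u00E1 obchodn\u00ED banka"));
        System.out.println(classify("Hello World!"));
        System.out.println(classify("Hello"));
    }
}
```

Checking for diacritics first matters here, because a letter such as Č is also outside A-Z and would otherwise fall into the second case.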
The Normalizer class seems to be able to accomplish this. Some limited testing indicates that
Normalizer.isNormalized(text, Normalizer.Form.NFD)
might be what you need: it returns false when text contains precomposed characters such as á that NFD would decompose.
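A small sketch of how that check behaves (the class and method names are hypothetical). One caveat worth keeping in mind: a string that is already decomposed counts as normalized, so this check alone will not flag it even though it contains a combining mark.

```java
import java.text.Normalizer;

public class NormalizedCheck {
    // True when NFD normalization would change the text,
    // i.e. it contains at least one character that decomposes.
    static boolean changesUnderNfd(String text) {
        return !Normalizer.isNormalized(text, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        System.out.println(changesUnderNfd("banka"));   // false: plain ASCII
        System.out.println(changesUnderNfd("\u00E1"));  // true: precomposed á
        System.out.println(changesUnderNfd("a\u0301")); // false: already decomposed
    }
}
```

If the input may arrive already decomposed, normalizing and then searching for combining marks is the more robust of the two approaches.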