I am designing an Android application that supports both English and Chinese. I want to know whether the user has typed English text or Chinese text. Is there any way to check this in Android?
Optical character recognition (OCR) – Many apps and websites provide OCR features where you can scan or take pictures of the character(s) you want to look up. Google Docs has such a feature and there are others online you can easily find by searching for “Chinese” and “OCR”.
If you want to detect whether the input string contains any CJK (Chinese, Japanese, Korean) ideographs, the following may help you:
public static boolean isCJK(String str) {
    int length = str.length();
    for (int i = 0; i < length; i++) {
        char ch = str.charAt(i);
        Character.UnicodeBlock block = Character.UnicodeBlock.of(ch);
        // Return true as soon as one character falls in a CJK ideograph block.
        if (Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS.equals(block)
                || Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS.equals(block)
                || Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A.equals(block)) {
            return true;
        }
    }
    return false;
}
The accepted answer is either incomplete or outdated. Here are a few methods you can use to test if a character is a CJK Ideograph. My fuller answer is here.
It is better to use the code point rather than charAt (as in the accepted answer) because many Chinese characters are in a higher code plane. Using charAt will give you only one half of a surrogate pair rather than the actual Chinese character. So a better way to loop through a String is like this:
final int length = myString.length();
for (int offset = 0; offset < length; ) {
    final int codepoint = Character.codePointAt(myString, offset);
    // use codepoint here
    offset += Character.charCount(codepoint);
}
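To make the difference concrete, here is a small sketch that compares counting UTF-16 code units (what charAt indexes) with counting code points. The class and method names (CodePointDemo, countCodeUnits, countCodePoints) are made up for illustration; only the Character and String calls come from the standard library.

```java
final class CodePointDemo {
    // Counts UTF-16 code units, which is what charAt() indexes into.
    static int countCodeUnits(String s) {
        return s.length();
    }

    // Counts whole code points by stepping over surrogate pairs.
    static int countCodePoints(String s) {
        int count = 0;
        for (int offset = 0; offset < s.length(); ) {
            int codepoint = Character.codePointAt(s, offset);
            count++;
            offset += Character.charCount(codepoint);
        }
        return count;
    }

    public static void main(String[] args) {
        // U+4E2D is in the Basic Multilingual Plane; U+20000 is in
        // CJK Unified Ideographs Extension B and needs a surrogate pair.
        String sample = "\u4E2D" + new String(Character.toChars(0x20000));
        System.out.println(countCodeUnits(sample));                        // prints 3
        System.out.println(countCodePoints(sample));                       // prints 2
        System.out.println(Character.isHighSurrogate(sample.charAt(1)));   // prints true
    }
}
```

The string holds two characters but three chars, and charAt(1) returns a lone high surrogate, which a per-char block test would not recognize as an Extension B ideograph.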
And testing the codepoints can be done in one of the following ways.
private boolean isCJK(int codepoint) {
    Character.UnicodeBlock block = Character.UnicodeBlock.of(codepoint);
    return (Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS.equals(block)
            || Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A.equals(block)
            || Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B.equals(block)
            || Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_C.equals(block) // API 19, remove these if supporting lower versions
            || Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_D.equals(block) // API 19
            || Character.UnicodeBlock.CJK_COMPATIBILITY.equals(block)
            || Character.UnicodeBlock.CJK_COMPATIBILITY_FORMS.equals(block)
            || Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS.equals(block)
            || Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS_SUPPLEMENT.equals(block)
            || Character.UnicodeBlock.CJK_RADICALS_SUPPLEMENT.equals(block)
            || Character.UnicodeBlock.CJK_STROKES.equals(block) // API 19
            || Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION.equals(block)
            || Character.UnicodeBlock.ENCLOSED_CJK_LETTERS_AND_MONTHS.equals(block)
            || Character.UnicodeBlock.ENCLOSED_IDEOGRAPHIC_SUPPLEMENT.equals(block) // API 19
            || Character.UnicodeBlock.KANGXI_RADICALS.equals(block)
            || Character.UnicodeBlock.IDEOGRAPHIC_DESCRIPTION_CHARACTERS.equals(block));
}
Or, on API 19 and above:
private boolean isCJK(int codepoint) {
    return Character.isIdeographic(codepoint);
}
Or, on API 24 and above:
private boolean isCJK(int codepoint) {
    return (Character.UnicodeScript.of(codepoint) == Character.UnicodeScript.HAN);
}
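Putting the code point loop and one of the tests together, a whole-string check might look like the sketch below. It uses the Character.UnicodeScript test, so on Android it assumes API 24 or higher; the names HanDetector and containsHan are hypothetical and not part of any library.

```java
final class HanDetector {
    // Returns true if any code point in the text belongs to the Han script.
    static boolean containsHan(String text) {
        for (int offset = 0; offset < text.length(); ) {
            int codepoint = Character.codePointAt(text, offset);
            if (Character.UnicodeScript.of(codepoint) == Character.UnicodeScript.HAN) {
                return true;
            }
            // Advance by 1 or 2 chars depending on whether this was a surrogate pair.
            offset += Character.charCount(codepoint);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsHan("hello"));                // prints false
        System.out.println(containsHan("hi \u4F60\u597D"));      // prints true
        // A supplementary-plane ideograph (U+20000) is also detected.
        System.out.println(containsHan(new String(Character.toChars(0x20000)))); // prints true
    }
}
```

Note that the Han script also covers Japanese kanji, so a true result means the text contains Han ideographs, not that it is necessarily Chinese. If you need to tell a mostly-English input from a mostly-Chinese one, you could count matching code points instead of returning on the first hit.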