
How to check if the word is Japanese or English?

Tags:

java

I want to handle English words and Japanese words differently in this method:

if (english) {
    // say english
} else {
    // say not english
}

How can I achieve this in JSP?

Kareem Nour Emam asked Dec 13 '22 05:12


1 Answer

Japanese characters lie within certain Unicode ranges:

  • U+3040–U+309F: Hiragana
  • U+30A0–U+30FF: Katakana
  • U+4E00–U+9FFF: Kanji (the CJK Unified Ideographs block)

So basically all you need to do is check whether the character's code point lies within one of the known ranges.

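import java.lang.Character.UnicodeBlock;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Unicode blocks treated as Japanese (CJK ideographs are shared with Chinese)
Set<UnicodeBlock> japaneseUnicodeBlocks = new HashSet<>(Arrays.asList(
    UnicodeBlock.HIRAGANA,
    UnicodeBlock.KATAKANA,
    UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS));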

String mixed = "This is a Japanese newspaper headline: ラドクリフ、マラソン五輪代表に1万m出場にも含み";

for (char c : mixed.toCharArray()) {
    if (japaneseUnicodeBlocks.contains(UnicodeBlock.of(c))) {
        System.out.println(c + " is a Japanese character");
    } else {
        System.out.println(c + " is not a Japanese character");
    }
}
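
The same check can also be written directly against the code point ranges listed above, without `UnicodeBlock`. This is only a minimal sketch: the class name `JapaneseRanges` and the method `isJapanese` are made up for illustration, and the upper Kanji bound is assumed to be U+9FFF, the end of the CJK Unified Ideographs block.

```java
final class JapaneseRanges {

    // Inclusive code point ranges (assumed bounds, see the list above).
    private static final int[][] RANGES = {
        {0x3040, 0x309F}, // Hiragana
        {0x30A0, 0x30FF}, // Katakana
        {0x4E00, 0x9FFF}, // Kanji (CJK Unified Ideographs)
    };

    // Returns true if the code point falls inside any of the ranges.
    static boolean isJapanese(int codePoint) {
        for (int[] range : RANGES) {
            if (codePoint >= range[0] && codePoint <= range[1]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isJapanese(0x30E9)); // Katakana "ra": true
        System.out.println(isJapanese(0x6F22)); // a Kanji: true
        System.out.println(isJapanese('T'));    // Latin letter: false
    }
}
```

Taking an `int` code point rather than a `char` keeps the helper usable with `String.codePoints()`.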

It's unclear when exactly you'd like to report the word as Japanese: when the string contains a mix of Japanese and Latin (or other!) characters, or only when it consists entirely of Japanese characters. The above example should at least be a good starting point.
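
Both interpretations can be sketched roughly as follows. The class `JapaneseDetector` and its methods `containsJapanese` and `isOnlyJapanese` are hypothetical names, not an existing API, and the choice to treat an empty string as "not Japanese" is an assumption.

```java
import java.lang.Character.UnicodeBlock;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class JapaneseDetector {

    // Same three blocks as in the example above.
    private static final Set<UnicodeBlock> JAPANESE_BLOCKS = new HashSet<>(Arrays.asList(
            UnicodeBlock.HIRAGANA,
            UnicodeBlock.KATAKANA,
            UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS));

    static boolean isJapanese(int codePoint) {
        return JAPANESE_BLOCKS.contains(UnicodeBlock.of(codePoint));
    }

    // Interpretation 1: at least one character is Japanese (mixed text counts).
    static boolean containsJapanese(String text) {
        return text.codePoints().anyMatch(JapaneseDetector::isJapanese);
    }

    // Interpretation 2: the text is non-empty and every character is Japanese.
    static boolean isOnlyJapanese(String text) {
        return !text.isEmpty()
                && text.codePoints().allMatch(JapaneseDetector::isJapanese);
    }

    public static void main(String[] args) {
        // Escapes spell a five-character Katakana word.
        String katakana = "\u30E9\u30C9\u30AF\u30EA\u30D5";
        String mixed = "headline: " + katakana;

        System.out.println(containsJapanese(mixed));    // true
        System.out.println(isOnlyJapanese(mixed));      // false
        System.out.println(isOnlyJapanese(katakana));   // true
        System.out.println(containsJapanese("hello"));  // false
    }
}
```

Note that the strict check rejects strings containing digits, spaces, or Japanese punctuation such as 、 (U+3001), since these fall outside the three blocks; whether that is desirable depends on the input.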

Please note that all of this is completely unrelated to JSP. JSP is just a web presentation technology which allows you to generate HTML/CSS/JS code dynamically. Writing Java code inside JSP files is considered bad practice.

BalusC answered Dec 31 '22 10:12
