Detect Chinese characters in Java

Using Java, how can I detect whether a String contains Chinese characters?

String chineseStr = "已下架";

if (isChineseString(chineseStr)) {
    System.out.println("The string contains Chinese characters");
} else {
    System.out.println("The string does not contain Chinese characters");
}

Can you please help me to solve the problem?

413

asked Oct 14 '14 09:10

Ran Deloun

2 Answers

Since Java 7, Character.isIdeographic(int codePoint) tells whether a code point is a CJKV (Chinese, Japanese, Korean and Vietnamese) ideograph.
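As a rough sketch of that first option, the check can be applied to every code point of a string. The class name IdeographCheck and the method containsIdeograph are made up for illustration, and the stream call assumes Java 8 or later.

```java
// Illustrative sketch: IdeographCheck and containsIdeograph are hypothetical names.
final class IdeographCheck {

    /** Returns true if at least one code point in s is a CJKV ideograph. */
    static boolean containsIdeograph(String s) {
        // Iterate by code point so characters outside the BMP are tested whole,
        // not as two separate surrogate chars.
        return s.codePoints().anyMatch(Character::isIdeographic);
    }

    public static void main(String[] args) {
        System.out.println(containsIdeograph("xxx已下架xxx")); // true
        System.out.println(containsIdeograph("hello"));       // false
    }
}
```

Because the ideographs are shared across CJKV writing systems, a true result says the string contains such a character, not that the text is in the Chinese language.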

A closer match is to test for Character.UnicodeScript.HAN.

So:

System.out.println(containsHanScript("xxx已下架xxx"));

public static boolean containsHanScript(String s) {
    for (int i = 0; i < s.length(); ) {
        int codepoint = s.codePointAt(i);
        i += Character.charCount(codepoint);
        if (Character.UnicodeScript.of(codepoint) == Character.UnicodeScript.HAN) {
            return true;
        }
    }
    return false;
}

Or in Java 8:

public static boolean containsHanScript(String s) {
    return s.codePoints().anyMatch(
            codepoint ->
            Character.UnicodeScript.of(codepoint) == Character.UnicodeScript.HAN);
}
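Both versions walk the string by code point rather than by char, which matters for Han characters outside the Basic Multilingual Plane. The following usage sketch illustrates this; the class name HanScriptDemo and the sample strings are assumptions added for the example, and the method body repeats the Java 8 variant above.

```java
// Illustrative sketch: HanScriptDemo and the sample inputs are hypothetical.
final class HanScriptDemo {

    static boolean containsHanScript(String s) {
        return s.codePoints().anyMatch(
                cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN);
    }

    public static void main(String[] args) {
        // U+20000 (CJK Extension B) is stored as two chars, a surrogate pair.
        String supplementary = new String(Character.toChars(0x20000));
        System.out.println(supplementary.length());            // 2
        System.out.println(containsHanScript(supplementary));  // true

        // Kana and hangul belong to other scripts.
        System.out.println(containsHanScript("かな カナ 한글"));  // false

        // Japanese kanji are Han characters, so they are reported too.
        System.out.println(containsHanScript("日本語"));         // true
    }
}
```

The last case shows the limit of the approach: it detects Han script, which Chinese shares with Japanese kanji and Korean hanja.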

191

answered Sep 19 '22 20:09

Joop Eggen

A more direct approach:

if ("粽子".matches("[\\u4E00-\\u9FA5]+")) {
    System.out.println("is Chinese");
}

If you also need to catch rarely used and exotic characters, you will need to add all the ranges; see the question "What's the complete range for Chinese characters in Unicode?"
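One possible way to avoid listing ranges by hand is the script class \p{IsHan}, which java.util.regex supports since Java 7. Note also that String.matches tests the entire string, so a "contains" check needs Matcher.find instead. The sketch below is not part of the answer; HanRegex, containsHan and isAllHan are hypothetical names.

```java
import java.util.regex.Pattern;

// Illustrative sketch: HanRegex, containsHan and isAllHan are hypothetical names.
final class HanRegex {

    // \p{IsHan} matches any single code point whose Unicode script is Han.
    private static final Pattern ANY_HAN = Pattern.compile("\\p{IsHan}");
    private static final Pattern ALL_HAN = Pattern.compile("\\p{IsHan}+");

    /** True if s contains at least one Han character anywhere. */
    static boolean containsHan(String s) {
        return ANY_HAN.matcher(s).find();
    }

    /** True if s is non-empty and consists only of Han characters. */
    static boolean isAllHan(String s) {
        return ALL_HAN.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(containsHan("xxx已下架xxx")); // true
        System.out.println(isAllHan("粽子"));           // true
        System.out.println(isAllHan("粽子abc"));        // false
    }
}
```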

answered Sep 19 '22 20:09

ccpizza

Tags: java, encoding, unicode, utf-8
