
Comparing a char to a code-point?

Tags:

java

unicode

What is the "correct" way of comparing a code-point to a Java character? For example:

int codepoint = str.codePointAt(0); char token = '\n'; // str being some String

I know I can probably do:

if (codepoint==(int) token) { ... } 

but this code looks fragile. Is there a formal API method for comparing codepoints to chars, or converting the char up to a codepoint for comparison?

asked Jun 22 '09 by Gili


People also ask

Do you use == or .equals for char?

A char is a primitive, not an object, so you cannot use equals() on it the way you do for strings; compare primitive chars with ==. equals() only applies to objects such as String or the boxed Character.
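For illustration, a minimal sketch of both cases (the variable names are just examples):

char a = '\n';
char b = '\n';
System.out.println(a == b);             // true: primitive chars are compared with ==

Character boxed = 'x';                  // a boxed Character is an object...
System.out.println(boxed.equals('x'));  // ...so equals() applies to it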

What is a Unicode point?

A Unicode code point is a unique number assigned to each Unicode character (which is either a character or a grapheme). Unfortunately, the Unicode rules allow some juxtaposed graphemes to be interpreted as other graphemes that already have their own code points (precomposed forms).
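As a small illustration of precomposed forms (using the standard java.text.Normalizer class), the grapheme "é" can be written either as the single code point U+00E9 or as U+0065 followed by the combining acute accent U+0301:

import java.text.Normalizer;

String precomposed = "\u00E9";    // é as one code point
String decomposed  = "e\u0301";   // e + combining acute accent (two code points)
System.out.println(precomposed.equals(decomposed));   // false: different code point sequences
System.out.println(Normalizer.normalize(decomposed, Normalizer.Form.NFC)
        .equals(precomposed));                        // true after NFC normalization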

What is code point in Java?

The String.codePointAt() method returns the Unicode code point of the character at the specified index in a string. The index of the first character is 0, the second character is 1, and so on.
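A quick sketch of codePointAt() (the string literal is just an example; 😀 is U+1F600, which needs a surrogate pair):

String s = "A\uD83D\uDE00";             // "A" followed by U+1F600
System.out.println(s.codePointAt(0));   // 65     (U+0041, 'A')
System.out.println(s.codePointAt(1));   // 128512 (U+1F600, read from the surrogate pair)
System.out.println((int) s.charAt(1));  // 55357  (0xD83D, just the high-surrogate code unit)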


1 Answer

A little bit of background: When Java appeared in 1995, the char type was based on the original "Unicode 88" specification, which was limited to 16 bits. A year later, when Unicode 2.0 was implemented, the concept of surrogate characters was introduced to go beyond the 16-bit limit.

Java internally represents all Strings in UTF-16 format. For code points exceeding U+FFFF, the code point is represented by a surrogate pair, i.e., two chars: the first is the high-surrogate code unit (in the range \uD800-\uDBFF), the second is the low-surrogate code unit (in the range \uDC00-\uDFFF).
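As a minimal sketch of that representation (using U+2F81A, the CJK ideograph the documentation quote below also uses as an example):

int cp = 0x2F81A;                                               // a supplementary code point
char[] units = Character.toChars(cp);                           // its UTF-16 encoding
System.out.println(units.length);                               // 2: a surrogate pair
System.out.printf("%X %X%n", (int) units[0], (int) units[1]);   // D87E DC1A
System.out.println(Character.isHighSurrogate(units[0]));        // true
System.out.println(Character.isLowSurrogate(units[1]));         // true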

From the early days, all basic Character methods were based on the assumption that a code point could be represented in one char, and that is what the method signatures still reflect. Presumably to preserve backward compatibility, this was not changed when Unicode 2.0 came around, so caution is needed when dealing with those methods. To quote from the Java documentation:

  • The methods that only accept a char value cannot support supplementary characters. They treat char values from the surrogate ranges as undefined characters. For example, Character.isLetter('\uD840') returns false, even though this specific value, if followed by any low-surrogate value in a string, would represent a letter.
  • The methods that accept an int value support all Unicode characters, including supplementary characters. For example, Character.isLetter(0x2F81A) returns true because the code point value represents a letter (a CJK ideograph).
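A small sketch of that difference, reusing the documentation's example values:

System.out.println(Character.isLetter('\uD840'));  // false: a lone high surrogate is undefined
System.out.println(Character.isLetter(0x2F81A));   // true:  the int overload sees the whole code point
System.out.println(Character.isLetter((int) 'A')); // true:  the int overload also covers BMP characters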

Casting the char to an int, as you do in your sample, works fine though.
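So, for the original example, a minimal sketch of the comparison (str and its contents are hypothetical):

String str = "foo\nbar";
int codepoint = str.codePointAt(3);   // 10, the code point of '\n'
char token = '\n';

// The char is promoted to int automatically, so no explicit cast is needed.
// The comparison can only ever be true for code points in the BMP (<= U+FFFF),
// because a single char cannot represent anything higher.
if (codepoint == token) {
    System.out.println("newline found");
}

Note that the explicit (int) cast in the question is harmless but unnecessary; the char is widened to int by the comparison itself.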

answered Oct 09 '22 by Christian Hang-Hicks