4 byte unicode character in Java

I am writing unit tests for my custom StringDatatype, and I need to write a 4-byte Unicode character in a string literal. "\U" does not work (illegal escape character error). For example: U+1F701 (0xf0 0x9f 0x9c 0x81). How can it be written in a string?

asked Dec 04 '14 06:12

Constantine

2 Answers

A Unicode code point is not 4 bytes; it is an integer (ranging, at the moment, from U+0000 to U+10FFFF).

Your 4 bytes are (wild guess) its UTF-8 encoding (edit: I was right).

You need to do this:

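import java.nio.charset.StandardCharsets;

// U+1F701 lies outside the BMP, so toChars returns two chars (a surrogate pair)
final char[] chars = Character.toChars(0x1F701);
final String s = new String(chars);
// UTF-8 encoding of that code point: 0xf0 0x9f 0x9c 0x81
final byte[] asBytes = s.getBytes(StandardCharsets.UTF_8);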

When Java was created, Unicode did not define code points outside the BMP (i.e., U+0000 to U+FFFF), which is why a char is only 16 bits long (this is only a guess, but I think I'm not far off the mark). Since then Java has had to adapt: a code point outside the BMP needs two chars, a leading surrogate followed by a trailing surrogate (Java calls these the high and low surrogate respectively). Java has no char literal that can hold a code point outside the BMP directly.

Given that a char is, in fact, a UTF-16 code unit and that string literals accept escapes for these, you can write this "character" in a String as "\uD83D\uDF01", or directly as the symbol if your editor and the compiler's source encoding support it.
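To check that the escaped literal really is U+1F701, here is a minimal sketch that compares it with the string built from the code point and prints its UTF-8 bytes. The class name SurrogateLiteralDemo and its helper methods are made up for illustration; Character.highSurrogate and Character.lowSurrogate need Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

final class SurrogateLiteralDemo {
    // Builds the string from the numeric code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // The same string written as two UTF-16 escapes in a literal.
    static String fromEscapes() {
        return "\uD83D\uDF01";
    }

    // Formats the UTF-8 bytes of s as space-separated upper-case hex.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = fromEscapes();
        System.out.println(s.equals(fromCodePoint(0x1F701)));             // true
        System.out.println(Character.highSurrogate(0x1F701) == '\uD83D'); // true
        System.out.println(Character.lowSurrogate(0x1F701) == '\uDF01');  // true
        System.out.println(utf8Hex(s));                                   // F0 9F 9C 81
    }
}
```

In a unit test, the escaped form has the advantage that it does not depend on the encoding the source file is compiled with.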

See also the CharsetDecoder and CharsetEncoder classes.
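As a rough sketch of why those classes can be useful in tests: String.getBytes() always replaces malformed input (such as an unpaired surrogate) with the charset's replacement bytes, while a CharsetEncoder configured with CodingErrorAction.REPORT throws instead. The class name StrictUtf8Codec and its methods are hypothetical helpers, not part of any library.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictUtf8Codec {
    // Encodes to UTF-8, throwing on unpaired surrogates instead of substituting.
    static byte[] encode(String s) throws CharacterCodingException {
        ByteBuffer buf = StandardCharsets.UTF_8.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(s));
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    // Decodes UTF-8, throwing on malformed byte sequences.
    static String decode(byte[] bytes) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    // True if s survives a strict encode/decode round trip unchanged.
    static boolean roundTrips(String s) {
        try {
            return decode(encode(s)).equals(s);
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrips("\uD83D\uDF01")); // complete surrogate pair
        System.out.println(roundTrips("\uD83D"));       // lone high surrogate
    }
}
```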

See also String.codePointCount(), and, since Java 8, String.codePoints() (inherited from CharSequence).
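A short sketch of both methods on a string that mixes BMP and non-BMP characters (Java 8 or later because of codePoints()). The class name CodePointIteration and the describe helper are invented for this example.

```java
final class CodePointIteration {
    // Returns the code points of s, one int per Unicode character.
    static int[] codePointsOf(String s) {
        return s.codePoints().toArray();
    }

    // Renders each code point in the usual U+XXXX notation.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDF01b";                            // 'a', U+1F701, 'b'
        System.out.println(s.length());                         // 4
        System.out.println(s.codePointCount(0, s.length()));    // 3
        System.out.println(describe(s));                        // U+0061 U+1F701 U+0062
    }
}
```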

answered Oct 19 '22 05:10

fge

String s = "𩸽";

Technically this is one character, but be careful: s.length() returns 2. Java also won't compile char c = '𩸽', because the symbol does not fit in a single char. Java does not promise that String.length() returns the number of characters; it returns the number of Java chars (UTF-16 code units) needed to store the string.

The actual number of characters (code points) can be obtained from s.codePointCount(0, s.length()).
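A minimal sketch of the difference, assuming 𩸽 is written with its escapes "\uD867\uDE3D" (U+29E3D) so the result does not depend on the source file encoding. The class name LengthVersusCodePoints and the firstSymbol helper are made up for illustration.

```java
final class LengthVersusCodePoints {
    // U+29E3D written as a surrogate pair.
    static final String SAMPLE = "\uD867\uDE3D";

    // Number of code points, as opposed to s.length() (number of chars).
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    // First whole symbol of s; a plain substring(0, 1) could cut a pair in half.
    // Throws IndexOutOfBoundsException for an empty string.
    static String firstSymbol(String s) {
        return s.substring(0, s.offsetByCodePoints(0, 1));
    }

    public static void main(String[] args) {
        System.out.println(SAMPLE.length());                 // 2
        System.out.println(codePointLength(SAMPLE));         // 1
        int cp = SAMPLE.codePointAt(0);                      // an int holds the full code point
        System.out.println(Integer.toHexString(cp));         // 29e3d
        System.out.println(firstSymbol(SAMPLE + "x").equals(SAMPLE)); // true
    }
}
```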

answered Oct 19 '22 03:10

Andrew

Tags: java, unicode
