Why can't I make a char bound to the Unicode castle character in Java?

class A {
    public static void main(String[] args) {
        char a = '∀';
        System.out.println(a);
        char castle = '𝍇';
        System.out.println(castle);
    }
}

I can make a char for the upside-down A just fine, but when I try to make the castle char, I get three compile errors. Why?

$ javac A.java && java A
A.java:5: unclosed character literal
        char castle = '𝍇';
                      ^
A.java:5: illegal character: \57159
        char castle = '𝍇';
                        ^
A.java:5: unclosed character literal
        char castle = '𝍇';
                         ^
3 errors
asked May 24 '13 by Dog

2 Answers

I suspect that the castle character does not fit in a single char but rather requires an int code point. In that case, you could use it in a String literal, but not as a char literal.

The Javadoc for Character states:

The char data type (and therefore the value that a Character object encapsulates) are based on the original Unicode specification, which defined characters as fixed-width 16-bit entities. The Unicode standard has since been changed to allow for characters whose representation requires more than 16 bits. The range of legal code points is now U+0000 to U+10FFFF, known as Unicode scalar value.

So my guess would be that this character requires more than 16 bits, and therefore needs to be treated as an int code point.
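A minimal sketch of working with it that way, assuming the castle is U+1D347 (the \ud834\udf47 surrogate pair used in the second answer; the class name CastleDemo is made up):

class CastleDemo {
    public static void main(String[] args) {
        // U+1D347 lies outside the Basic Multilingual Plane, so it cannot
        // be stored in one 16-bit char; keep it as an int code point.
        int castle = 0x1D347;

        // Character.toChars expands the code point into its surrogate pair.
        char[] pair = Character.toChars(castle);       // length 2
        System.out.println(new String(pair));          // prints the castle

        // Alternatively, append the code point to a string directly.
        String s = new StringBuilder().appendCodePoint(castle).toString();
        System.out.println(s.length());                        // 2 chars
        System.out.println(s.codePointCount(0, s.length()));   // 1 code point
    }
}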

answered Oct 31 '22 by Louis Wasserman

If your source code file contains non-ASCII characters (as this one does), you need to make sure javac reads it with the correct encoding; otherwise it falls back to the platform default, which may not match the encoding the file was saved in.

So, if you saved your file in UTF-8 from your editor, you can compile it using:

javac -encoding utf8 A.java

Note that you can also use Unicode escape sequences instead of the literal characters; this makes the code compile without the -encoding flag:

char a = '\u2200';              // The escape for the ∀ character (U+2200)
String castle = "\ud834\udf47"; // A surrogate pair: the castle is a supplementary character and does not fit in a single 16-bit char
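As a quick check, this sketch (the class name Check is made up) confirms that the escape pair encodes a single code point across two chars:

class Check {
    public static void main(String[] args) {
        String castle = "\ud834\udf47";
        System.out.println(castle.length());                           // 2 UTF-16 code units
        System.out.println(castle.codePointCount(0, castle.length())); // 1 code point
        System.out.printf("U+%X%n", castle.codePointAt(0));            // U+1D347
    }
}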
answered Oct 31 '22 by EmirCalabuch