
Why does this code show an "invalid unicode" error?

Tags: java, eclipse

    //System.out.println("hii");'\uxxx'

The println statement is commented out, but the Unicode escape is not treated as commented out. Why?

chandankumar patra asked Dec 08 '22 02:12


2 Answers

Java allows you to use Unicode in your source code. Unlike many other languages, it allows you to do so anywhere, including, of course, comments. And it allows it in identifiers as well, so you can write legal Java code like this:

    String हिन्दी = "Hindi";

The variable name is perfectly legal (although coding conventions discourage such use).

So as far as javac is concerned, the source code is Unicode. The problem is that Unicode text can be stored in different encodings, some editors don't support Unicode, and there are places where a non-ASCII file is going to create problems.

For that reason, Java lets you use Unicode escapes in the code. With them, the file can stay entirely in ASCII even though it has identifiers or comments in Unicode. You can replace any character in the code with the equivalent Unicode escape, even "normal" characters like ;. For example, the following line:

    String s = "123";

can be written as:

    String s \u003d "123"\u003b

And it will be compiled correctly and without any problems. You can, in fact, write the whole program in Unicode escapes, including the newlines. The Java compiler simply doesn't care if the Unicode escapes are inside literals or in the source itself.
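
To try this yourself, here is a minimal sketch that puts the plain line and the escaped line side by side. The class and method names are made up for illustration; the only point is that both methods should compile and return the same string.

```java
// Hypothetical demo class; the names are illustrative only.
class UnicodeEscapeDemo {

    // The ordinary spelling of the statement.
    static String plain() {
        String s = "123";
        return s;
    }

    // The same statement with '=' and ';' written as Unicode escapes.
    // The escapes are translated before the source is split into tokens.
    static String escaped() {
        String s \u003d "123"\u003b
        return s;
    }

    public static void main(String[] args) {
        System.out.println(plain().equals(escaped())); // prints "true"
    }
}
```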

But the upshot of this is that the compiler needs to interpret Unicode escapes first, and only then break the source into tokens such as identifiers, operators and comments, and after that it checks syntax etc.

Which means that if you have an illegal Unicode escape sequence in your source, it will be flagged as an error even though it's inside a comment, because at this point the compiler doesn't even know that the particular part of the code it is looking at is a comment.
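
The ordering described above can be modelled in a few lines. This is only a simplified sketch of the "translate escapes first" step, not how javac is actually implemented, and `EscapeTranslator` and `translate` are hypothetical names. It assumes an escape is a backslash (preceded by an even number of backslashes), one or more `u` characters, and exactly four hex digits. Because it works on raw characters, it has no idea whether it is looking at a comment.

```java
// Simplified model of the "translate Unicode escapes first" step.
// EscapeTranslator and translate are hypothetical names, not part of javac.
class EscapeTranslator {

    // Returns the value of a hex digit, or -1 if c is not one.
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Replaces every escape (backslash, one or more 'u', four hex digits)
    // with the character it denotes. Nothing here knows or cares whether
    // the escape sits inside a comment, a literal or plain code.
    static String translate(String source) {
        StringBuilder out = new StringBuilder();
        int backslashes = 0; // length of the backslash run just before index i
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            boolean startsEscape = c == '\\' && backslashes % 2 == 0
                    && i + 1 < source.length() && source.charAt(i + 1) == 'u';
            if (!startsEscape) {
                out.append(c);
                backslashes = (c == '\\') ? backslashes + 1 : 0;
                i++;
                continue;
            }
            int j = i + 1;
            while (j < source.length() && source.charAt(j) == 'u') {
                j++; // extra 'u' characters are allowed
            }
            int value = 0;
            for (int k = j; k < j + 4; k++) {
                int digit = k < source.length() ? hexValue(source.charAt(k)) : -1;
                if (digit < 0) {
                    throw new IllegalArgumentException("illegal unicode escape at index " + i);
                }
                value = value * 16 + digit;
            }
            out.append((char) value);
            backslashes = 0;
            i = j + 4;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A doubled backslash in a string literal yields one literal backslash.
        System.out.println(translate("String s \\u003d \"123\"\\u003b"));
        try {
            // The line from the question: the escape is malformed, comment or not.
            translate("//System.out.println(\"hii\");'\\uxxx'");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The first call prints the ordinary statement, and the second is rejected even though the whole input is a comment, which mirrors the error in the question.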

RealSkeptic answered Dec 20 '22 10:12


A Unicode escape is written as \uCODE (with a backslash), not /uCODE. If the escape is a line terminator such as \u000d and you write something after it, you may get a compile-time error, because the text after the escape is no longer part of the comment. Any other valid escape inside a single-line comment stays commented out, so there is no need to comment it out separately.

    //Compilation Error
    //System.out.println("hii"); \u000d Hello

EDIT

When the compiler starts, it replaces every Unicode escape with the character it stands for, including escapes inside comments.

So during compilation the statement above becomes:

    //System.out.println("hii");
    Hello
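
The same effect can also produce code that compiles and runs, if the text after the escaped line terminator happens to be a valid statement. A minimal sketch, with a hypothetical class name, assuming the escape is translated to a carriage return that ends the comment:

```java
// Hypothetical demo; the class and method names are illustrative only.
class CommentNewlineDemo {

    static int run() {
        int n = 0;
        // The escape on the next line is translated to a carriage return,
        // which ends that comment early, so the increment is real code.
        // looks commented out \u000d n++;
        return n;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 1
    }
}
```

If the comment were left intact, `run()` would return 0, so the increment shows that the part after the escape is compiled.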
akash answered Dec 20 '22 10:12