A unicode newline character(\u000d) in Java

Tags: java, unicode

Consider the following Java code snippet.

public class Main {
    public static void main(String[] args) {
        // new Character(' \u000d System.out.println("Hello");
    }
}

In the code above, the only line in the main() method is commented out, and that comment even appears to contain syntax errors, yet the program prints Hello to the console. If the line is uncommented, the code no longer compiles at all.

Why does it output "Hello" here?

Lion asked Nov 13 '11 23:11


People also ask

What is u000D in Java?

\u000d is the Unicode escape for the carriage return character (U+000D), which Java treats as a line terminator. Before actual compilation, the Java compiler translates every Unicode escape in the source into the character it represents. This translation is applied to the entire source code, including comments.

What is a newline character in Unicode?

LF (character : \n, Unicode : U+000A, ASCII : 10, hex : 0x0a): This is simply the '\n' character which we all know from our early programming days. This character is commonly known as the 'Line Feed' or 'Newline Character'.

What is a newline character in Java?

The newline character, also called end of line (EOL), line break, or line separator, is a control character (or sequence of characters) that marks the end of a line of text, so that the next character starts on a new line. On Windows it is \r\n; on Linux it is \n. In Java, we can use System.

What is a Unicode character in Java?

Unicode is a computing industry standard designed to consistently and uniquely encode characters used in written languages throughout the world. The Unicode standard uses hexadecimal to express a character. For example, the value 0x0041 represents the Latin character A.
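The code point values quoted in these snippets can be checked directly. The following is a minimal sketch; the class name CodePoints and the helper describe are made up for illustration and do not come from the original page.

```java
class CodePoints {
    // Hypothetical helper: formats a char's numeric value in U+XXXX notation.
    static String describe(char c) {
        return String.format("U+%04X", (int) c);
    }

    public static void main(String[] args) {
        System.out.println(describe('A'));   // Latin capital letter A
        System.out.println(describe('\n'));  // line feed (LF)
        System.out.println(describe('\r'));  // carriage return (CR)
    }
}
```

Note that line feed and carriage return are two distinct characters; Java accepts either one (or the pair CR LF) as a line terminator in source code.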


1 Answer

Java translates Unicode escapes everywhere in the source code, not just inside string literals.
This allows you to use Unicode characters in identifiers even when the source file is not saved in a Unicode encoding.
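As a small illustration of escapes working outside string literals, the sketch below spells part of a variable name with an escape. The class name EscapedIdentifier and the method demo are hypothetical, chosen only for this example.

```java
class EscapedIdentifier {
    static int demo() {
        // The escape below stands for the letter 'a', so this declares "abc".
        int \u0061bc = 41;
        // The plain spelling refers to the very same variable.
        return abc + 1;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The escape is replaced before the compiler splits the source into tokens, which is why the two spellings name one variable.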

Therefore, the \u000d in the comment is translated to a carriage return, which Java treats as a line terminator. That ends the comment, so System.out.println("Hello"); is compiled as an ordinary statement in main().
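To make the effect concrete, here is a minimal sketch that places the same trick next to the form the compiler effectively sees after the escape is translated. The class and method names (EscapeInComment, viaEscape, afterTranslation) are invented for illustration, and a StringBuilder stands in for the console so the result can be returned.

```java
class EscapeInComment {
    // The comment inside this method contains a Unicode escape for carriage
    // return, so the compiler ends the comment there and compiles the rest
    // of that line as a real statement.
    static String viaEscape() {
        StringBuilder out = new StringBuilder();
        // looks commented out \u000d out.append("Hello");
        return out.toString();
    }

    // Roughly what the compiler sees once the escape has been translated.
    static String afterTranslation() {
        StringBuilder out = new StringBuilder();
        // looks commented out
        out.append("Hello");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(viaEscape());
        System.out.println(afterTranslation());
    }
}
```

Both methods should return the same string, since the second is just the first with the escape replaced by an actual line break.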

SLaks answered Sep 19 '22 03:09