
Why should I use a different number of escape characters in different situations?

Tags:

java

regex

With regular expressions in Java, why should I write "\n" to define a newline character but "\\s" to define a whitespace character?

Why does the number of backslashes differ?

miptTheBest asked Aug 03 '14 12:08





1 Answer

Java does its own string parsing first: it converts the literal in your source code to an internal string in memory before it passes that string to the regex parser.

Java converts the two characters \n into a single linefeed (ASCII code 0x0A), and the first two (!) characters of \\s into a single backslash, leaving \s. This string is then passed to the regex parser, which recognizes its own set of escape sequences and treats \s as "any whitespace character".
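The two stages described above can be observed by checking what the string literals contain before any regex code runs. This is only a minimal sketch; the class and method names are hypothetical and chosen for illustration.

```java
public class EscapeLevels {
    // Length of the string the compiler produces for the literal "\n".
    static int newlineLiteralLength() {
        return "\n".length();
    }

    // Length of the string the compiler produces for the literal "\\s".
    static int whitespaceLiteralLength() {
        return "\\s".length();
    }

    public static void main(String[] args) {
        System.out.println(newlineLiteralLength());     // 1: a single linefeed character
        System.out.println(whitespaceLiteralLength());  // 2: a backslash followed by 's'
        System.out.println("\\s".charAt(0) == '\\');    // true: the first character is a backslash
        System.out.println("a b".matches("a\\sb"));     // true: the regex parser reads \s as whitespace
    }
}
```

The lengths show that only "\\s" still contains a backslash by the time the regex parser sees it.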

By this point, \n is already stored as a single linefeed character, so the regex parser matches it literally and does not process it as an escape again.

Since regular expressions also recognize the sequence \n as a linefeed, you can write \\n in your Java string as well: Java converts the escaped \\ to a single \, and the regex parser then finds \n, which it translates into a linefeed itself.
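As a rough illustration of that equivalence, the sketch below compiles both spellings and checks each against a linefeed. The class name and the helper matchesLinefeed are hypothetical; Pattern.compile, Pattern.pattern, and Matcher.matches are standard java.util.regex calls.

```java
import java.util.regex.Pattern;

public class NewlineEscapes {
    // Built from a one-character string: the regex parser receives a raw linefeed.
    static final Pattern RAW = Pattern.compile("\n");
    // Built from a two-character string: the regex parser receives \ and n
    // and interprets that escape itself.
    static final Pattern ESCAPED = Pattern.compile("\\n");

    // Returns true if the pattern matches a string consisting of one linefeed.
    static boolean matchesLinefeed(Pattern p) {
        return p.matcher("\n").matches();
    }

    public static void main(String[] args) {
        System.out.println(RAW.pattern().length());      // 1
        System.out.println(ESCAPED.pattern().length());  // 2
        System.out.println(matchesLinefeed(RAW));        // true
        System.out.println(matchesLinefeed(ESCAPED));    // true
    }
}
```

Both patterns match the same input; they differ only in which layer, the Java compiler or the regex parser, resolves the escape.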

Jongware answered Oct 12 '22 02:10
