With regular expressions in Java, why should I write "\n" to define a newline character but "\\s" to define a whitespace character?
Why does the number of backslashes differ?
Escape sequences allow you to send nongraphic control characters to a display device. For example, the ESC character (\033) is often used as the first character of a control command for a terminal or printer. Some escape sequences are device-specific.
In C, all escape sequences consist of two or more characters, the first of which is the backslash, \ (called the "Escape character"); the remaining characters determine the interpretation of the escape sequence. For example, \n is an escape sequence that denotes a newline character.
Java does its own string parsing first: the compiler converts the string literal in your code to an internal string in memory before that string is sent to the regex parser.
Java converts the 2 characters \n to a linefeed (ASCII code 0x0A) and the first 2 (!) characters in \\s to a single backslash, leaving \s. This string is then sent to the regex parser, and since regular expressions recognize their own escape sequences, the parser treats \s as "any whitespace".
At this point, the code \n is already stored as a single linefeed character, so the regular expression parser has no escape left to interpret and simply matches that character literally.
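The two stages can be observed directly by checking how long each string is once the compiler has processed it. The following is a minimal sketch; the class name EscapeLevels and its helper methods are made up for illustration.

```java
import java.util.regex.Pattern;

public class EscapeLevels {
    // Number of characters the compiler actually stored for a literal.
    static int storedLength(String literal) {
        return literal.length();
    }

    // True if the regex \s (written "\\s" in Java source) matches the whole input.
    static boolean matchesWhitespace(String input) {
        return Pattern.matches("\\s", input);
    }

    public static void main(String[] args) {
        String newline = "\n";      // stored as 1 character: linefeed (0x0A)
        String whitespace = "\\s";  // stored as 2 characters: backslash, 's'

        System.out.println(storedLength(newline));         // 1
        System.out.println((int) newline.charAt(0));       // 10
        System.out.println(storedLength(whitespace));      // 2
        System.out.println(whitespace.charAt(0) == '\\');  // true

        // The regex parser receives \s and interprets it as "any whitespace".
        System.out.println(matchesWhitespace(" "));        // true
        System.out.println(matchesWhitespace("\t"));       // true
        System.out.println(matchesWhitespace("x"));        // false
    }
}
```

Writing "\s" with a single backslash is not an equivalent shortcut: older Java versions reject it as an illegal escape, and since Java 15 it is a string escape for a plain space character.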
Since regular expressions also recognize the sequence \n as "a linefeed", you can also use \\n in your Java string: Java converts the escaped \\ to a single \, and the regular expression module then finds \n, which (again) gets translated into a linefeed.
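As a quick check that both spellings behave the same, the sketch below compiles each form and matches it against a linefeed. The class name NewlineEscapes and the helper matchesLinefeed are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

public class NewlineEscapes {
    // True if the given regex matches a string consisting of one linefeed.
    static boolean matchesLinefeed(String regex) {
        return Pattern.matches(regex, "\n");
    }

    public static void main(String[] args) {
        // "\n": Java turns it into a linefeed; the regex matches it literally.
        System.out.println(matchesLinefeed("\n"));   // true

        // "\\n": Java turns it into backslash + 'n'; the regex parser
        // then interprets \n as a linefeed.
        System.out.println(matchesLinefeed("\\n"));  // true

        // The two patterns differ only in what the regex parser receives.
        System.out.println("\n".length());   // 1
        System.out.println("\\n".length());  // 2
    }
}
```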