Consider the array:
new Pattern[] {Pattern.compile("\\["),Pattern.compile("\\]") };
IntelliJ IDEA tells me that the \\ is redundant and suggests replacing \\] with ], i.e. the result is:
new Pattern[] {Pattern.compile("\\["),Pattern.compile("]") };
Why is the \\ in the first Pattern.compile("\\[") OK, but redundant in the second?
Use square brackets as escape characters for the percent sign, the underscore, and the left bracket. The right bracket does not need an escape character; use it by itself. If you use the hyphen as a literal character, it must be the first character inside a set of square brackets.
Do brackets need to be escaped in regex? Although the dot ( . ) has a special meaning in regex, inside a character class (square brackets) every character except ^ , - , ] and \ is a literal and does not require an escape sequence.
A character can be escaped either by enclosing the phrase in braces or by placing a backslash before the character. Passing a left bracket to the regular expression parser so that it is evaluated as the start of a range of characters takes 1 escape.
A string enclosed in square brackets matches any one character in the string. For example, the regular expression [abc] matches a , b , or c . Within a bracket expression, certain characters have special meanings.
The ] symbol is not a special regex operator outside a character class if there is no corresponding unescaped [ before it. Only special characters require escaping. A [ is a special regex operator outside a character class (as it may mark the starting point of a character class). Once the Java regular expression engine sees an unescaped [ in the pattern, it knows there must be a ] ahead to close the character class. If there is no opening [ in the expression, it does not matter to the engine whether the ] is escaped or not: it is treated as a mere literal ] symbol. So, [abc] will match a, b or c, and \[abc] or \[abc\] will match the literal character sequence [abc].
So, a literal [ should always be escaped, while ] does not have to be escaped outside a character class.
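A minimal sketch of the behaviour outside a character class, assuming a standard JDK. The class name BracketOutsideClass and the helpers fullMatch and compiles are made up for this illustration; only Pattern, Matcher and PatternSyntaxException come from java.util.regex.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class: shows how [ and ] behave outside a character class.
class BracketOutsideClass {
    // Returns true when the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // Returns true when the regex compiles without a syntax error.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // An unescaped ] with no opening [ is a plain literal.
        System.out.println(fullMatch("]", "]"));             // true
        // Escaping it is allowed, just redundant.
        System.out.println(fullMatch("\\]", "]"));           // true
        // Only the [ needs the escape to match the text [abc].
        System.out.println(fullMatch("\\[abc]", "[abc]"));   // true
        System.out.println(fullMatch("\\[abc\\]", "[abc]")); // true
        // Unescaped, [abc] is a character class matching one char.
        System.out.println(fullMatch("[abc]", "b"));         // true
        // A lone unescaped [ opens a class that is never closed.
        System.out.println(compiles("["));                   // false
    }
}
```

This is why the IDE flags only the second escape: "\\]" and "]" compile to patterns that match the same text, while dropping the escape from "\\[" would make the pattern invalid.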
When used inside a character class, both [ and ] must be escaped in a Java regular expression, as they may form intersection/subtraction patterns, unless the ] appears at the beginning of the character class (e.g. "[a]".replaceAll("[]\\[]", "") returns a).
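The character class rules can be checked with a short sketch, again assuming a standard JDK. The class name BracketInsideClass and the helper strip are invented for this example and are not part of any library.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: shows how [ and ] behave inside a character class.
class BracketInsideClass {
    // Removes every character matched by the given character class.
    static String strip(String charClass, String input) {
        return Pattern.compile(charClass).matcher(input).replaceAll("");
    }

    // Returns true when the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Both brackets escaped inside the class: the safe default.
        System.out.println(strip("[\\[\\]]", "[a]"));     // a
        // A ] placed first in the class is taken literally,
        // so only the [ needs the escape here.
        System.out.println(strip("[]\\[]", "[a]"));       // a
        // An unescaped [ inside a class opens a nested class,
        // so [a[bc]] behaves like [abc] and does not match ].
        System.out.println(fullMatch("[a[bc]]", "b"));    // true
        System.out.println(fullMatch("[a[bc]]", "]"));    // false
    }
}
```

Escaping both brackets inside a class is the form that is least likely to surprise a reader, even though the ]-first form is accepted.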
Other regex flavors
icu onigmo - In the ICU and Onigmo regex flavors, ] behaves the same as in the Java regex flavor. Languages affected: swift, ruby, r (stringr), kotlin, groovy.
pcre boost .net re2 python posix - In Boost, PCRE and the other flavors in this group, ] is not a special char (i.e. needs no escaping) outside a character class, and is a special char (i.e. needs escaping) inside a character class, where it can be left unescaped only if it is the first char in the character class. It is not an error to escape it everywhere it is supposed to match a literal ] char. Languages/tools affected: php, perl, c#/vb.net/etc., python, sed, grep, awk, elixir, r (both default base R TRE and PCRE enabled with "perl=TRUE"), tcl, google-sheets.
ecmascript - In ECMAScript flavors, ] is not special outside a character class, while [ is special outside a character class. Inside a character class, ] must ALWAYS be escaped, even if it is the first char in the character class. [ inside a character class is not special, and escaping it is not an error, even if the regexp is compiled with the /u flag (in JavaScript). With the /u flag, however, an unescaped ] outside a character class is a syntax error in JavaScript, so be careful here. Languages affected: javascript, dart, c++, vba, google-apps-script (which uses JavaScript).