Just out of curiosity, I'm trying to figure out exactly which is the right way to escape a backslash for use in a PHP regular expression pattern, like so:
TEST 01: (3 backslashes)
$pattern = "/^[\\\]{1,}$/";
$string = '\\';
// ----- RETURNS A MATCH -----
TEST 02: (4 backslashes)
$pattern = "/^[\\\\]{1,}$/";
$string = '\\';
// ----- ALSO RETURNS A MATCH -----
According to the articles below, 4 is supposedly the right way but what confuses me is that both tests returned a match. If both are right, then is 4 the preferred way?
RESOURCES:
The backslash suppresses the special meaning of the character it precedes, and turns it into an ordinary character. To insert a backslash into your regular expression pattern, use a double backslash ('\\').
2.7 Backslash (\) and Regex Escape Sequences
Regex uses backslash (\) for two purposes:
- for metacharacters such as \d (digit), \D (non-digit), \s (space), \S (non-space), \w (word), \W (non-word);
- to escape special regex characters, e.g., \. for . , \+ for + , \* for * , \? for ? .
In PHP, an escape sequence starts with a backslash \ . Escape sequences apply to double-quoted strings. A single-quoted string only uses the escape sequences for a single quote or a backslash.
// PHP 5.4.1

// Either three or four \ can be used to match a '\'.
echo preg_match( '/\\\/', '\\' ); // 1
echo preg_match( '/\\\\/', '\\' ); // 1

// Match two backslashes `\\`.
echo preg_match( '/\\\\\\/', '\\\\' ); // Warning: No ending delimiter '/' found
echo preg_match( '/\\\\\\\/', '\\\\' ); // 1
echo preg_match( '/\\\\\\\\/', '\\\\' ); // 1

// Match one backslash using a character class.
echo preg_match( '/[\\]/', '\\' ); // no match: the class is never closed, so compilation fails with a warning
echo preg_match( '/[\\\]/', '\\' ); // 1
echo preg_match( '/[\\\\]/', '\\' ); // 1
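A likely explanation for why both counts work, sketched here rather than taken from the tests above: PHP processes the string literal before PCRE ever sees it. In a single-quoted string, `\\` collapses to one backslash, while a backslash in front of any character other than a backslash or a quote is kept as is. The three- and four-backslash literals therefore end up as the very same pattern. The variable names are only illustrative.

```php
<?php
// Three backslashes: PHP reads `\\` as one backslash and leaves `\/` untouched.
$three = '/\\\/';
// Four backslashes: PHP reads `\\` twice, giving two backslashes.
$four = '/\\\\/';

var_dump($three === $four);    // bool(true)
echo $three, PHP_EOL;          // /\\/
echo strlen($three), PHP_EOL;  // 4
```

Either way PCRE compiles the regex `\\`, which matches one literal backslash.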
When using three backslashes to match a '\', the pattern below (three backslashes followed by \s) is interpreted as: match a '\' followed by an 's'.
echo preg_match( '/\\\\s/', '\\ ' ); // 0
echo preg_match( '/\\\\s/', '\\s' ); // 1
When using four backslashes to match a '\', the pattern below (four backslashes followed by \s) is interpreted as: match a '\' followed by a space character.
echo preg_match( '/\\\\\s/', '\\ ' ); // 1
echo preg_match( '/\\\\\s/', '\\s' ); // 0
The same applies if inside a character class.
echo preg_match( '/[\\\\s]/', ' ' ); // 0
echo preg_match( '/[\\\\\s]/', ' ' ); // 1
None of the above results are affected by enclosing the strings in double instead of single quotes.
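To see why the `\s` cases behave differently, it can help to print the regex that PCRE actually receives. The sketch below assumes a small helper, `regex_seen_by_pcre()`, which is made up for illustration and simply strips the `/` delimiters. It also checks that one of the patterns comes out identical in double quotes.

```php
<?php
// Hypothetical helper for illustration: strips the `/` delimiters so that only
// the regex PCRE compiles is left.
function regex_seen_by_pcre(string $literal): string
{
    return substr($literal, 1, -1);
}

echo regex_seen_by_pcre('/\\\\s/'), PHP_EOL;   // \\s   : a backslash, then the letter s
echo regex_seen_by_pcre('/\\\\\s/'), PHP_EOL;  // \\\s  : a backslash, then whitespace
echo regex_seen_by_pcre('/[\\\\s]/'), PHP_EOL; // [\\s] : a backslash or the letter s

// `\s` is not an escape sequence in double quotes either, so both quoting
// styles produce the same string here.
var_dump("/\\\\\s/" === '/\\\\\s/'); // bool(true)
```

This only holds for sequences PHP does not expand itself. In double quotes, sequences such as `\n`, `\t`, or `\x5c` are replaced by PHP before the pattern reaches PCRE.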
Conclusions:
Whether inside or outside a bracketed character class, a literal backslash can be matched using just three backslashes '\\\', unless the next character in the pattern is also backslashed, in which case the literal backslash must be matched using four backslashes.
Recommendation:
Always use four backslashes '\\\\' in a regex pattern when seeking to match a backslash.
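If counting backslashes by hand feels error-prone, one possible alternative (not part of the tests above) is to let `preg_quote()` add the regex-level escaping. The helper name `literal_pattern()` is hypothetical. Note that the input still needs PHP's own string-level escaping.

```php
<?php
// Hypothetical helper: builds a pattern that matches $literal exactly,
// letting preg_quote() add the regex-level escaping.
function literal_pattern(string $literal): string
{
    return '/' . preg_quote($literal, '/') . '/';
}

$pattern = literal_pattern('\\');          // one backslash in, the pattern /\\/ out
var_dump($pattern === '/\\\\/');           // bool(true)
var_dump(preg_match($pattern, 'C:\\dir')); // int(1)
```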
To avoid this kind of unclear code, you can use the escape sequence \x5c instead, like this :)
echo preg_replace( '/\x5c\w+\.php$/i', '<b>${0}</b>', __FILE__ );
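A small sketch of how `\x5c` behaves, with made-up variable names. One caveat is that it relies on single quotes. In double quotes PHP expands `\x5c` to a backslash itself, and the pattern is left without a closing delimiter.

```php
<?php
// Single quotes: PHP passes `\x5c` through unchanged and PCRE decodes it.
$single = '/\x5c/';
var_dump(preg_match($single, 'a\\b')); // int(1)
var_dump(preg_match($single, 'ab'));   // int(0)

// Double quotes: PHP itself turns `\x5c` into a backslash, so the pattern
// becomes /\/ and the final slash is read as escaped, not as a delimiter.
$double = "/\x5c/";
var_dump($double === '/\\/');          // bool(true)
```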