 

Right way to escape backslash [ \ ] in PHP regex?

Tags: regex, php

Just out of curiosity, I'm trying to figure out which exactly is the right way to escape a backslash for use in a PHP regular expression pattern like so:

TEST 01: (3 backslashes)

$pattern = "/^[\\\]{1,}$/";
$string = '\\';
// ----- RETURNS A MATCH -----

TEST 02: (4 backslashes)

$pattern = "/^[\\\\]{1,}$/";
$string = '\\';
// ----- ALSO RETURNS A MATCH -----

According to the articles below, 4 is supposedly the right way but what confuses me is that both tests returned a match. If both are right, then is 4 the preferred way?

RESOURCES:

  • http://www.developwebsites.net/match-backslash-preg_match-php/
  • Can't escape the backslash with regex?
asked Jun 15 '12 03:06 by Mahmoud Tahan




2 Answers

// PHP 5.4.1

// Either three or four \ can be used to match a '\'.
echo preg_match( '/\\\/', '\\' );        // 1
echo preg_match( '/\\\\/', '\\' );       // 1

// Match two backslashes `\\`.
echo preg_match( '/\\\\\\/', '\\\\' );   // Warning: No ending delimiter '/' found
echo preg_match( '/\\\\\\\/', '\\\\' );  // 1
echo preg_match( '/\\\\\\\\/', '\\\\' ); // 1

// Match one backslash using a character class.
echo preg_match( '/[\\]/', '\\' );       // Warning: compilation failed (missing terminating ]); returns false
echo preg_match( '/[\\\]/', '\\' );      // 1
echo preg_match( '/[\\\\]/', '\\' );     // 1

When three backslashes are used to match a '\' and the pattern continues with \s, the result is interpreted as a '\' followed by a literal 's'.

echo preg_match( '/\\\\s/', '\\ ' );    // 0
echo preg_match( '/\\\\s/', '\\s' );    // 1

When four backslashes are used to match a '\' and the pattern continues with \s, the result is interpreted as a '\' followed by a whitespace character.

echo preg_match( '/\\\\\s/', '\\ ' );   // 1
echo preg_match( '/\\\\\s/', '\\s' );   // 0

The same applies inside a character class.

echo preg_match( '/[\\\\s]/', ' ' );   // 0
echo preg_match( '/[\\\\\s]/', ' ' );  // 1

None of the above results are affected by enclosing the strings in double instead of single quotes.
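As a rough check of that claim, the sketch below compares single- and double-quoted spellings of a few of the patterns above. It assumes PHP 7.1 or later (for array destructuring in foreach), and the variable names are illustrative only. The strings come out identical because \/, \] and \s are not escape sequences in double-quoted PHP strings, so PHP leaves those backslashes alone in both quoting styles.

```php
<?php
// Each pair holds the same pattern written with single and double quotes.
// Variable names are illustrative and not part of the original answer.
$pairs = [
    ['/\\\/',    "/\\\/"],
    ['/\\\\/',   "/\\\\/"],
    ['/[\\\]/',  "/[\\\]/"],
    ['/\\\\\s/', "/\\\\\s/"],
];

foreach ($pairs as [$single, $double]) {
    // Prints bool(true) when both spellings parse to the same string.
    var_dump($single === $double);
}
```

This equivalence is specific to these patterns. Sequences that double quotes do interpret, such as \n, \t, \$ or \x5c, would be converted by PHP before the regex engine sees them.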

Conclusions:
Whether inside or outside a bracketed character class, a literal backslash can be matched using just three backslashes '\\\' unless the next token in the pattern also begins with a backslash (such as \s), in which case the literal backslash must be matched using four backslashes.
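One way to see why this happens, as a minimal sketch: print each pattern after PHP has parsed the string literal, since that parsed string is what the regex engine actually receives. The variable names are illustrative and not taken from the answer.

```php
<?php
// What PHP hands to the regex engine after parsing each single-quoted literal.
$three = '/\\\/';    // "\\" collapses to "\", "\/" is kept as-is: /\\/
$four  = '/\\\\/';   // two "\\" pairs collapse to two backslashes: /\\/

echo $three, "\n";
echo $four, "\n";
var_dump($three === $four);   // both literals produce the same string

// Once the pattern continues with \s, the two spellings diverge.
$threeThenS = '/\\\\s/';   // parsed as /\\s/  : a backslash, then a literal "s"
$fourThenS  = '/\\\\\s/';  // parsed as /\\\s/ : a backslash, then whitespace

echo $threeThenS, "\n";
echo $fourThenS, "\n";
```

In a single-quoted string only \\ and \' are collapsed, so an odd trailing backslash survives only when the character after it is not another backslash.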

Recommendation:
Always use four backslashes '\\\\' in a regex pattern when seeking to match a backslash.

See also: Escape sequences.

answered Oct 05 '22 23:10 by MikeM


To avoid this kind of hard-to-read code, you can write the backslash as the hex escape \x5c, like this:

echo preg_replace( '/\x5c\w+\.php$/i', '<b>${0}</b>', __FILE__ ); 
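Because the result of the line above depends on the path separator in __FILE__, here is a small sketch of the same idea using a hard-coded, hypothetical Windows-style path. The path and variable names are assumptions made for illustration.

```php
<?php
// Hypothetical Windows-style path, used instead of __FILE__ so the result
// does not depend on the operating system's path separator.
$path = 'C:\\www\\site\\index.php';

// Inside single quotes PHP leaves \x5c untouched; the regex engine then
// reads it as the character with hex code 5C, which is a backslash.
$pattern = '/\x5c\w+\.php$/i';

// Wrap the final "\name.php" segment in <b> tags.
$result = preg_replace($pattern, '<b>${0}</b>', $path);
echo $result, "\n";
```

The single quotes matter here: in a double-quoted string PHP itself would turn \x5c into a backslash, and the pattern would then reach the regex engine as /\/, which has no ending delimiter.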
answered Oct 05 '22 22:10 by Олег Всильдеревьев