
RegEx: Look-behind to avoid odd number of consecutive backslashes

I have user input in which some tags are allowed inside square brackets. I've already written the regex pattern to find and validate what's inside the brackets.

In the user input field, an opening bracket ([) can be escaped with a backslash, and a backslash can itself be escaped with another backslash (\\). I need a look-behind sub-pattern that rejects an odd number of consecutive backslashes before the opening bracket.

At the moment I'm using something like this:

(?<!\\)(?:\\\\)*\[(?<inside_brackets>.*?)]

It works fine, but the problem is that the match still includes any pairs of consecutive backslashes in front of the bracket (even though they are not captured). The look-behind only checks whether another single backslash precedes those pairs (or the opening bracket directly). I'd like to handle them all inside the look-behind group if possible.

Example:

my [test] string is ok
my \[test] string is wrong
my \\[test] string is ok
my \\\[test] string is wrong
my \\\\[test] string is ok
my \\\\\[test] string is wrong
...
etc
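
The examples above can be checked with a short PHP script. This is only a sketch: it assumes the named group is spelled inside_brackets (PCRE group names cannot contain spaces), and the variable names are illustrative.

```php
<?php
// Pattern from the question; the group name inside_brackets is an assumption.
// Nowdoc syntax keeps every backslash literal, so no extra PHP escaping is needed.
$pattern = <<<'REGEX'
/(?<!\\)(?:\\\\)*\[(?<inside_brackets>.*?)]/
REGEX;

// The sample lines from the question, one per line.
$input = <<<'TXT'
my [test] string
my \[test] string
my \\[test] string
my \\\[test] string
my \\\\[test] string
my \\\\\[test] string
TXT;

$results = [];      // true where the pattern matched
$fullMatches = [];  // the complete matched text ($m[0]) for each hit

foreach (explode("\n", $input) as $line) {
    $matched = preg_match($pattern, $line, $m) === 1;
    $results[] = $matched;
    if ($matched) {
        $fullMatches[] = $m[0];
    }
    echo str_pad($line, 26), $matched ? 'ok' : 'wrong', "\n";
}
```

Lines with an even number of backslashes (including zero) match, and lines with an odd number do not. Note that $m[0] for the matching lines still starts with the pairs of backslashes, which is the behaviour described above.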

I'm working with PHP (PCRE).

Wh1T3h4Ck5 asked Mar 08 '12 06:03



1 Answer

Last time I checked, PHP did not support variable-length look-behinds, and (?:\\\\)* can match any number of characters. That is why you cannot use the trivial solution (?<![^\\](?:\\\\)*\\).

The simplest workaround would be to match the entire thing, not just the brackets part:

(?<!\\)((?:\\\\)*)\[(?<inside_brackets>.*?)]

The difference is that the backslashes in front of the bracket are now captured in group 1. If you're using that regex in a preg_replace, remember to prefix the replacement string with $1 so the backslashes are restored.
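
Here is a minimal sketch of that replacement in PHP. The sample input and the angle-bracket output format are made up for illustration; only the pattern comes from the answer.

```php
<?php
// Pattern from the answer: group 1 captures the even run of backslashes,
// group 2 (also named inside_brackets) captures the text inside the brackets.
$pattern = <<<'REGEX'
/(?<!\\)((?:\\\\)*)\[(?<inside_brackets>.*?)]/
REGEX;

// Hypothetical input: one plain tag, one escaped bracket, one tag after an escaped backslash.
$input = <<<'TXT'
a [b] c \[d] e \\[f] g
TXT;

// With $1 in the replacement, the matched backslashes are put back.
$withPrefix = preg_replace($pattern, '$1<$2>', $input);

// Without $1, the backslashes in front of a matched bracket are lost.
$withoutPrefix = preg_replace($pattern, '<$2>', $input);

echo $withPrefix, "\n";
echo $withoutPrefix, "\n";
```

The first call should leave the escaped \[d] untouched and keep the two backslashes before the replaced [f]. The second call drops those two backslashes, which is why the $1 prefix matters.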

Etienne Perot answered Oct 23 '22 14:10
