
Regular Expression Word Boundary and Special Characters

Tags: regex

I have a regular expression that escapes all special characters in a search string. This works well, but I can't get it to work with word boundaries. For example, with the haystack

add +

or

add (+)

and the needle

+

the regular expression /\+/gi matches the "+". However the regular expression /\b\+/gi doesn't. Any ideas on how to make this work?

Using

add (plus)

as the haystack and /\bplus/gi as the regex, it matches fine. I just can't figure out why the escaped characters are having problems.

ggutenberg asked Jul 13 '10 21:07


1 Answer

\b is a zero-width assertion: it doesn't consume any characters, it just asserts that a certain condition holds at a given position. A word boundary asserts that the position is either preceded by a word character and not followed by one, or followed by a word character and not preceded by one. (A "word character" is a letter, a digit, or an underscore.) In your string:

add +

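...there's a word boundary at the beginning because the a is not preceded by a word character, and there's one after the second d because it's not followed by a word character. The \b in your regex (/\b\+/) is trying to match between the space and the +, which doesn't work because neither of those is a word character. The same thing happens with add (+): there the \b would have to match between the ( and the +, and neither of those is a word character either. With add (plus), /\bplus/ matches because the p is a word character and the ( before it is not.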
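To see this concretely, here is a minimal JavaScript sketch that lists where \b matches in the haystacks from the question. The helper names (boundaryPositions, escapeRegExp, startsOutsideWord) are made up for illustration. The last helper is only one possible workaround, assuming an engine that supports lookbehind and assuming that "not immediately preceded by a word character" is the behaviour you want in place of the leading \b.

```javascript
// Indices at which \b matches in a string.
// matchAll handles the zero-width matches without looping forever.
function boundaryPositions(text) {
  return [...text.matchAll(/\b/g)].map((m) => m.index);
}

// Hypothetical helper: escape regex metacharacters in the search string.
function escapeRegExp(needle) {
  return needle.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical alternative to a leading \b (an assumption, not part of the
// explanation above): require that the match is not preceded by a word
// character. For a needle that starts with a word character this behaves
// like \b; for a needle such as "+" it still allows a match after a space
// or "(".
function startsOutsideWord(needle) {
  return new RegExp("(?<!\\w)" + escapeRegExp(needle), "gi");
}

console.log(boundaryPositions("add +"));      // [ 0, 3 ]
console.log(boundaryPositions("add (plus)")); // [ 0, 3, 5, 9 ]

console.log(/\+/.test("add +"));          // true
console.log(/\b\+/.test("add +"));        // false: no boundary at index 4
console.log(/\bplus/.test("add (plus)")); // true: boundary at index 5

console.log(startsOutsideWord("+").test("add +"));   // true
console.log(startsOutsideWord("+").test("add (+)")); // true
```

Note that with this lookbehind a + directly after a letter or digit, as in a+b, is not matched. Whether that is desirable depends on what the search is meant to find.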

Alan Moore answered Nov 15 '22 11:11