What is a non-word boundary in regex (\B), compared to a word boundary?
A word boundary \b is a test, just like ^ and $. When the regexp engine (the program module that implements searching for regexps) comes across \b, it checks that the position in the string is a word boundary.
A non-word boundary matches everywhere else: between any pair of characters that are both word characters or both non-word characters; at the beginning of a string if the first character is a non-word character; and at the end of a string if the last character is a non-word character.
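As a rough illustration of that rule, here is a small JavaScript sketch. The sample strings and variable names are made up for this example. It puts \B on both sides of a literal, so the literal only matches when it sits inside a longer run of word characters.

```javascript
// \Bcat\B demands a NON-word boundary on both sides of "cat",
// so "cat" has to be surrounded by other word characters.
const inner = /\Bcat\B/;

// "n" before and "e" after: both neighbours are word characters.
const insideWord = inner.test("concatenate");
// The string starts and ends with a word character, so both ends are word boundaries.
const wholeWord = inner.test("cat");
// A space next to a letter is a word boundary, so \B fails here too.
const spaced = inner.test("a cat sat");

console.log(insideWord, wholeWord, spaced); // true false false
```

Swapping \B for \b flips the result: /\bcat\b/ matches "cat" and "a cat sat" but not "concatenate".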
A word boundary, in most regex dialects, is a position between a word character (\w) and a non-word character (\W), or at the beginning or end of a string if it begins or ends (respectively) with a word character ([0-9A-Za-z_]). So, in the string "-12", it would match before the 1 and after the 2. The dash is not a word character.
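The "-12" case can be checked with a short JavaScript sketch. Putting a "|" marker at every matching position is just one convenient way to make zero-width matches visible, and the variable names are illustrative.

```javascript
const input = "-12";

// Replace every zero-width match with a visible marker.
const markedWord = input.replace(/\b/g, "|");    // positions where \b holds
const markedNonWord = input.replace(/\B/g, "|"); // positions where \B holds

console.log(markedWord);    // -|12|
console.log(markedNonWord); // |-1|2
```

The two results never mark the same position, since every position in a string is either a word boundary or a non-word boundary.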
The following three positions qualify as word boundaries: before the first character in a string, if the first character is a word character; after the last character in a string, if the last character is a word character; and between two characters in a string, if one is a word character and the other is not.
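Those three rules can be written out by hand. The sketch below is not how any regex engine is actually implemented. The helper names isWordChar and isWordBoundary are hypothetical, and it assumes the ASCII definition of a word character given above. Treating "no character" (before the start or after the end) like a non-word character lets all three rules collapse into one comparison.

```javascript
// Word characters as described above: [0-9A-Za-z_].
// A missing character (undefined) is treated as a non-word character.
const isWordChar = (ch) => ch !== undefined && /^[0-9A-Za-z_]$/.test(ch);

// pos is an index between characters: 0 is before the first character,
// str.length is after the last one.
function isWordBoundary(str, pos) {
  const before = isWordChar(str[pos - 1]); // false at the start of the string
  const after = isWordChar(str[pos]);      // false at the end of the string
  // A boundary exists when exactly one side is a word character.
  return before !== after;
}

// Positions 0..3 of "-12": only 1 (before "1") and 3 (after "2") are boundaries.
console.log([0, 1, 2, 3].map((pos) => isWordBoundary("-12", pos)));
// [ false, true, false, true ]
```

A non-word boundary check is then simply the negation, !isWordBoundary(str, pos).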
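A word boundary (\b) is a zero-width match that can match between a word character (\w) and a non-word character (\W), or between a word character and the start or end of the string. In JavaScript the definition of \w is [A-Za-z0-9_] and \W is anything else.
The negated version of \b, written \B, is a zero-width match where the above does not hold. Therefore it can match between two word characters, between two non-word characters, or between a non-word character and the start or end of the string.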
For example, if the string is "Hello, world!" then \b matches in the following places:
 H e l l o ,   w o r l d !
^         ^   ^         ^
And \B matches those places where \b doesn't match:
 H e l l o ,   w o r l d !
  ^ ^ ^ ^   ^   ^ ^ ^ ^   ^
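The marked positions for "Hello, world!" can be reproduced with a short JavaScript sketch. The helper matchPositions is a hypothetical name introduced here. It relies on the sticky flag ("y"), which forces a match attempt at exactly lastIndex, to test each position in turn.

```javascript
// Collect every index (0..text.length) at which a zero-width pattern matches.
function matchPositions(pattern, text) {
  const sticky = new RegExp(pattern.source, "y"); // "y" anchors the attempt at lastIndex
  const positions = [];
  for (let i = 0; i <= text.length; i++) {
    sticky.lastIndex = i;
    if (sticky.test(text)) positions.push(i);
  }
  return positions;
}

const text = "Hello, world!";
const boundaryPositions = matchPositions(/\b/, text);
const nonBoundaryPositions = matchPositions(/\B/, text);

console.log(boundaryPositions);    // [ 0, 5, 7, 12 ]
console.log(nonBoundaryPositions); // [ 1, 2, 3, 4, 6, 8, 9, 10, 11, 13 ]
```

Index 0 is the position before "H" and index 13 is the position after "!", so the two lists together cover all 14 positions exactly once.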