I have this expression:
\b[A-Za-z]+\b
If I give it abc@de mnop, it matches abc, de and mnop, but I want it to match only mnop. How can I do that?
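For reference, the behaviour described in the question can be reproduced with a short sketch. Python's re module is an assumption here (the question does not name an engine), and the variable names are only illustrative.

```python
import re

text = "abc@de mnop"

# \b treats '@' as a non-word character, so "abc" and "de" each
# sit between two word boundaries and are reported separately.
matches = re.findall(r"\b[A-Za-z]+\b", text)
print(matches)
```

In this sketch all three words are returned, not just mnop.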
The metacharacter \b is an anchor like the caret and the dollar sign. It matches at a position that is called a “word boundary”. This match is zero-length.
\s stands for “whitespace character”. Which characters this actually includes depends on the regex flavor. In all flavors discussed in this tutorial, it includes [ \t\r\n\f]. That is: \s matches a space, a tab, a carriage return, a line feed, or a form feed.
The word boundary \b matches positions where one side is a word character (usually a letter, digit or underscore, though the exact set varies across engines) and the other side is not a word character (for instance, it may be the beginning of the string or a space character).
A word boundary, in most regex dialects, is a position between \w (a word character, [0-9A-Za-z_]) and \W (a non-word character), or at the beginning or end of a string if it begins or ends (respectively) with a word character. So, in the string "-12", it would match before the 1 or after the 2. The dash is not a word character.
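The "-12" example can be checked by listing the offsets at which \b matches. This is only a sketch, assuming Python's re module; the name positions is made up for illustration.

```python
import re

# Each match of \b is empty, so only its offset is interesting.
# Offsets: 0 is before '-', 1 is before '1', 2 is before '2', 3 is the end.
positions = [m.start() for m in re.finditer(r"\b", "-12")]
print(positions)
```

The boundary before the 1 is offset 1 and the boundary after the 2 is offset 3. Offset 0 is not reported because the dash is not a word character.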
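\b is a word boundary. It is related to [^a-zA-Z0-9_] but not the same thing: the character class consumes one non-word character, whereas \b consumes nothing and only checks that a word character and a non-word character (or the start or end of the string) meet at the current position.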
You can instead use this regex:
(?<=\s|^)[a-zA-Z]+(?=\s|$)
(?<=\s|^) -> match only if the pattern is preceded by a space (\s) or the start of the string/line (^)
[a-zA-Z]+ -> the pattern
(?=\s|$) -> match only if the pattern is followed by a space (\s) or the end of the string/line ($)
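Whether this regex compiles as written depends on the engine. As far as I know, Python's re module requires a fixed-width lookbehind and rejects one that mixes \s with ^. The sketch below assumes Python and moves the alternation outside the lookbehind, which should keep the same meaning. The names original_compiles and portable are only illustrative.

```python
import re

text = "abc@de mnop"

# Assumption: Python's re. The lookbehind (?<=\s|^) mixes a one-character
# alternative with a zero-width one, so it may fail to compile here.
try:
    re.compile(r"(?<=\s|^)[a-zA-Z]+(?=\s|$)")
    original_compiles = True
except re.error:
    original_compiles = False

# Same idea with the alternation moved outside the lookbehind:
# either whitespace precedes the word, or we are at the start.
portable = re.compile(r"(?:(?<=\s)|^)[a-zA-Z]+(?=\s|$)")
matches = portable.findall(text)
print(original_compiles, matches)
```

With the input from the question, only mnop survives, because abc is followed by @ and de is preceded by it.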
\b means (?:(?<!\w)(?=\w)|(?<=\w)(?!\w)), which matches the positions between the letters and @. That is why abc and de are each found as separate words.
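One way to convince yourself of this equivalence is to compare the offsets matched by \b with those matched by the expanded lookaround form. This is a sketch assuming Python's re module; expanded, b_positions and e_positions are hypothetical names.

```python
import re

text = "abc@de mnop"
expanded = r"(?:(?<!\w)(?=\w)|(?<=\w)(?!\w))"

# Both patterns produce only empty matches, so compare their offsets.
b_positions = [m.start() for m in re.finditer(r"\b", text)]
e_positions = [m.start() for m in re.finditer(expanded, text)]
print(b_positions)
print(e_positions)
```

Both lists include offsets 3 and 4, the two sides of the @, which is where the unwanted matches for abc and de end and begin.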
You can write:
(?<!\S)[A-Za-z]+(?!\S)
(?!\S) is equivalent to (?=\s|$), and likewise (?<!\S) is equivalent to (?<=\s|^).
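A minimal sketch of this approach, assuming Python's re module (the names pattern and matches are illustrative): both lookarounds are fixed width, so the expression should compile in engines that restrict lookbehind.

```python
import re

# (?<!\S): no non-whitespace character immediately before the word.
# (?!\S):  no non-whitespace character immediately after the word.
pattern = re.compile(r"(?<!\S)[A-Za-z]+(?!\S)")

matches = pattern.findall("abc@de mnop")
print(matches)
```

Words that touch punctuation on either side are skipped as well, so "hello, world foo" would yield only world and foo.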