
Regex whitespace word boundary

Tags: regex

I have this expression:

\b[A-Za-z]+\b

If I give abc@de mnop, it matches abc, de and mnop, but I want it to match only mnop. How can I do that?

Darshana asked Mar 24 '13 17:03



2 Answers

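\b is a word boundary.

It does not consume a character. It is a zero-width assertion that succeeds at any position where a word character ([a-zA-Z0-9_]) is next to a non-word character or the start/end of the string. Since @ is a non-word character, there is a boundary on each side of it, which is why abc and de both match.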
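To see the behaviour from the question concretely, here is a minimal sketch using Python's built-in re module (the choice of Python is an assumption; the question does not name a regex flavor):

```python
import re

text = "abc@de mnop"

# '@' is not a word character, so \b succeeds on both sides of it.
# That lets "abc" and "de" match as separate words.
matches = re.findall(r"\b[A-Za-z]+\b", text)
print(matches)  # ['abc', 'de', 'mnop']
```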

You can instead use this regex:

(?<=\s|^)[a-zA-Z]+(?=\s|$)
-------- --------- ------
   |         |       |-> match only if the pattern is followed by whitespace (\s) or end of string/line ($)
   |         |-> pattern
   |-> match only if the pattern is preceded by whitespace (\s) or start of string/line (^)
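The pattern above works as written in flavors that allow alternatives of different widths inside a lookbehind. Python's built-in re module is not one of them and rejects (?<=\s|^), so the sketch below is an adaptation rather than the answer's exact regex: the start anchor is moved outside the lookbehind. The variable names are illustrative only.

```python
import re

# Adaptation for Python's re: (?<=\s|^) is rejected there because the two
# lookbehind branches have different widths, so ^ is pulled out of it.
pattern = re.compile(r"(?:(?<=\s)|^)[a-zA-Z]+(?=\s|$)")

whole_words = pattern.findall("abc@de mnop")
print(whole_words)  # ['mnop']
```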
Anirudha answered Sep 30 '22 14:09


\b is equivalent to (?:(?<!\w)(?=\w)|(?<=\w)(?!\w)), so it also matches at the positions between the letters and the @.
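As a quick check of that equivalence, the following sketch (Python's re module, chosen here as an assumption) lists the positions where \b and the expanded lookaround form succeed in the string from the question:

```python
import re

text = "abc@de mnop"
expanded = r"(?:(?<!\w)(?=\w)|(?<=\w)(?!\w))"

# Both patterns are zero-width, so only the match positions are compared.
boundary_positions = [m.start() for m in re.finditer(r"\b", text)]
expanded_positions = [m.start() for m in re.finditer(expanded, text)]

print(boundary_positions)                        # [0, 3, 4, 6, 7, 11]
print(boundary_positions == expanded_positions)  # True
```

Positions 3 and 4 are the two sides of the @, which is where the unwanted matches for abc and de come from.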

You can write:

(?<!\S)[A-Za-z]+(?!\S)

(?!\S) is equivalent to (?=\s|$), and (?<!\S) is likewise equivalent to (?<=\s|^).
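A small sketch of this version in Python's re module (an assumed flavor; the sample strings other than the one from the question are made up for illustration). It also compares the (?!\S) form against (?=\s|$) on those samples:

```python
import re

# Only the first sample comes from the question; the rest are illustrative.
samples = ["abc@de mnop", "mnop", "one two@x three", "tab\tsep x1 y"]

with_negated = re.compile(r"(?<!\S)[A-Za-z]+(?!\S)")
with_positive = re.compile(r"(?<!\S)[A-Za-z]+(?=\s|$)")

results = {s: with_negated.findall(s) for s in samples}
same = all(with_negated.findall(s) == with_positive.findall(s) for s in samples)

for sample, found in results.items():
    print(repr(sample), "->", found)
print("lookaheads agree:", same)
```

The negated form has the practical advantage that both lookarounds are fixed-width, so it is accepted by engines that restrict what a lookbehind may contain.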

Qtax answered Sep 30 '22 12:09