Say, for example, I have the following string "one two(three) (three) four five" and I want to replace "(three)" with "(four)", but not within words. How would I do it?
Basically I want to do a regex replace and end up with the following string:
"one two(three) (four) four five"
I have tried the following regex but it doesn't work:
@"\b\(three\)\b"
Basically I am writing some search and replace code and am giving the user the usual options to match case, match whole word etc. In this instance the user has chosen to match whole words but I don't know what the text being searched for will be.
A word boundary is a zero-width test between two characters. To pass the test, there must be a word character on one side and a non-word character on the other side (the start or end of the string counts as a non-word character for this purpose). It does not matter which side each character appears on, but there must be one of each.
The metacharacter \b is an anchor like the caret and the dollar sign. It matches at a position that is called a “word boundary”.
The \b metacharacter matches at the beginning or end of a word.
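To make the definition concrete, here is a small sketch that lists every position where \b succeeds in a string. The class and method names (WordBoundaryDemo, BoundaryPositions) are made up for illustration; only Regex.Matches and Match.Index come from .NET.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Hypothetical helper: returns every index at which the
    // zero-width \b assertion succeeds in the input.
    public static List<int> BoundaryPositions(string input)
    {
        var positions = new List<int>();
        foreach (Match m in Regex.Matches(input, @"\b"))
            positions.Add(m.Index); // each match is empty; Index is the boundary position
        return positions;
    }
}
```

For the string "one two(three) (three) four five" this should give 0, 3, 4, 7, 8, 13, 16, 21, 23, 27, 28, 32. Note that there is no boundary at 14 or 15, because the characters on both sides there are non-word characters.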
Your problem stems from a misunderstanding of what \b actually means. Admittedly, it is not obvious.
The reason \b\(three\)\b doesn’t match the threes in your input string is the following: \b means the boundary between a word character and a non-word character. Letters are considered word characters, while punctuation marks such as ( and ) are considered non-word characters.
Here is your input string again, stretched out a bit, and I’ve marked the places where \b matches:
 o n e   t w o ( t h r e e )   ( t h r e e )   f o u r   f i v e
↑     ↑ ↑     ↑ ↑         ↑     ↑         ↑   ↑       ↑ ↑       ↑
As you can see here, there is a \b between “two” and “(three)”, but not before the second “(three)”.
The moral of the story? “Whole-word search” doesn’t really make much sense if what you’re searching for is not just a word (a string of letters). Since you have punctuation characters (parentheses) in your search string, it is not as such a “word”. If you searched for a word consisting only of word characters, then \b would do what you expect.
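As a quick check of this point, the sketch below counts matches when a literal search string is simply wrapped in \b on both sides. The names NaiveWholeWordDemo and CountNaive are hypothetical; the pattern construction is the naive approach from the question, not a recommendation.

```csharp
using System.Text.RegularExpressions;

public static class NaiveWholeWordDemo
{
    // Hypothetical helper: wraps the escaped search text in \b on both
    // sides and counts how many times the result matches.
    public static int CountNaive(string input, string search)
    {
        string pattern = @"\b" + Regex.Escape(search) + @"\b";
        return Regex.Matches(input, pattern).Count;
    }
}
```

With the input "one two(three) (three) four five", searching for "three" should find 2 matches, while searching for "(three)" should find none. In a string such as "two(three)four", the same "(three)" search does match, because there the parentheses sit directly next to letters, which is the opposite of what a whole-word option is supposed to do.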
You can, of course, use a different regex to match the string only if it is surrounded by whitespace or occurs at the beginning or end of the string:
(^|\s)\(three\)(\s|$)
However, the problem with this is, of course, that if you search for “three” (without the parentheses), it won’t find the one in “(three)” because it doesn’t have spaces around it, even though it is actually a whole word.
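If you use this idea for replacing rather than just finding, note that the pattern above also consumes the surrounding whitespace. One possible variation, sketched here with lookarounds swapped in for the capturing groups, leaves the whitespace alone. The names WhitespaceDelimitedReplace and Replace are made up for the example.

```csharp
using System.Text.RegularExpressions;

public static class WhitespaceDelimitedReplace
{
    // Hypothetical helper: replaces the search text only where it is
    // delimited by whitespace or by the start/end of the input.
    public static string Replace(string input, string search, string replacement)
    {
        // (?<!\S): not preceded by a non-whitespace character.
        // (?!\S):  not followed by a non-whitespace character.
        string pattern = @"(?<!\S)" + Regex.Escape(search) + @"(?!\S)";

        // A MatchEvaluator inserts the replacement literally, so a '$'
        // in it is not interpreted as a substitution.
        return Regex.Replace(input, pattern, _ => replacement);
    }
}
```

Replacing "(three)" with "(four)" in "one two(three) (three) four five" should then yield "one two(three) (four) four five". The caveat described above still applies: searching for "three" this way leaves the string unchanged, because neither occurrence has whitespace around it.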
I think most text editors (including Visual Studio) will use \b only if your search string actually starts and/or ends with a word character:
var pattern = Regex.Escape(searchString);
// Add \b only on a side where the search string has a word character.
if (Regex.IsMatch(searchString, @"^\w"))
    pattern = @"\b" + pattern;
if (Regex.IsMatch(searchString, @"\w$"))
    pattern = pattern + @"\b";
That way they will find “(three)” even if you select “whole words only”.
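Put together with a replace call, that approach might look like the sketch below. The class WholeWordSearch, its method names, and the matchCase parameter are assumptions made for the example (the question mentions a match-case option but shows no code for it).

```csharp
using System.Text.RegularExpressions;

public static class WholeWordSearch
{
    // Adds \b only on the sides where the search text begins or ends
    // with a word character, as in the snippet above.
    public static string BuildPattern(string searchString)
    {
        var pattern = Regex.Escape(searchString);
        if (Regex.IsMatch(searchString, @"^\w"))
            pattern = @"\b" + pattern;
        if (Regex.IsMatch(searchString, @"\w$"))
            pattern = pattern + @"\b";
        return pattern;
    }

    // Hypothetical wrapper: whole-word replace with an optional
    // case-insensitive mode; the replacement is inserted literally.
    public static string Replace(string input, string searchString,
                                 string replacement, bool matchCase)
    {
        var options = matchCase ? RegexOptions.None : RegexOptions.IgnoreCase;
        return Regex.Replace(input, BuildPattern(searchString), _ => replacement, options);
    }
}
```

With the input "one two(three) (three) four five", replacing "three" with "3" should give "one two(3) (3) four five", and searching for "our" should change nothing because it only occurs inside "four". Be aware that searching for "(three)" gets no \b at all under this scheme, so both occurrences are replaced, including the one attached to "two".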
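Here is a simple snippet you may be interested in:
// Escape the search text so regex metacharacters in it match literally.
string pattern = @"\b" + Regex.Escape(find) + @"\b";
string result = Regex.Replace(stringToSearch, pattern, replace, RegexOptions.IgnoreCase);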
Source code: snip2code - C#: Replace an exact word in a sentence