I was going through the question "C#, Regex.Match whole words".
It says that to match a whole word you should use "\bpattern\b". This works fine for whole words without any special characters, since \b is defined in terms of word characters only.
I need an expression that also matches words containing special characters. My code is as follows:
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main(string[] args)
        {
            string str = Regex.Escape("Hi temp% dkfsfdf hi");
            string pattern = Regex.Escape("temp%");
            var matches = Regex.Matches(str, "\\b" + pattern + "\\b", RegexOptions.IgnoreCase);
            int count = matches.Count; // comes out as 0, although one match is expected
        }
    }
But it fails because of the %. Is there a workaround for this? The pattern can contain other special characters too, such as a space, '(', ')', etc.
To match a character that has a special meaning in regex, prefix it with a backslash ( \ ) to form an escape sequence. E.g., \. matches "." ; \+ matches "+" ; and \( matches "(" .
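As a small illustration of the escaping rule above, Regex.Escape adds the backslashes for you. This is only a sketch; the class name EscapeDemo and the sample strings are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

static class EscapeDemo
{
    static void Main()
    {
        // Metacharacters such as ( ) + . get a backslash prefix.
        Console.WriteLine(Regex.Escape("f(x)+1"));   // f\(x\)\+1
        Console.WriteLine(Regex.Escape("a.b"));      // a\.b

        // '%' has no special meaning in a .NET regex, so it is left alone.
        Console.WriteLine(Regex.Escape("temp%"));    // temp%

        // White space is escaped too, which matters if you escape the input by mistake.
        Console.WriteLine(Regex.Escape("Hi temp%")); // Hi\ temp%
    }
}
```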
The \b metacharacter matches at the beginning or end of a word.
To run a “whole words only” search using a regular expression, simply place the word between two word boundaries, as we did with ‹ \bcat\b ›. The first ‹ \b › requires the ‹ c › to occur at the very start of the string, or after a nonword character.
If you have non-word characters then you cannot use \b. You can use the following:
@"(?<=^|\s)" + pattern + @"(?=\s|$)"
Edit: As Tim mentioned in the comments, your regex fails precisely because \b does not match between % and the whitespace next to it: both are non-word characters, and \b matches only at the boundary between a word character and a non-word character.
See more on word boundaries here.
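The boundary behaviour described in the edit can be checked with a few calls to Regex.IsMatch. This is a minimal sketch; the class name WordBoundaryDemo and the second sample string "Hi temp%x" are assumptions added for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class WordBoundaryDemo
{
    static void Main()
    {
        string input = "Hi temp% dkfsfdf hi";

        // 'p' is a word character and '%' is not, so \b holds right after "temp".
        Console.WriteLine(Regex.IsMatch(input, @"\btemp\b"));        // True

        // '%' and the following space are both non-word characters: no boundary there.
        Console.WriteLine(Regex.IsMatch(input, @"\btemp%\b"));       // False

        // Here '%' is followed by the word character 'x', so \b holds,
        // even though "temp%" is not a whole word in this input.
        Console.WriteLine(Regex.IsMatch("Hi temp%x", @"\btemp%\b")); // True
    }
}
```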
Explanation
@"
(?<= # Assert that the regex below can be matched, with the match ending at this position (positive lookbehind)
# Match either the regular expression below (attempting the next alternative only if this one fails)
^ # Assert position at the beginning of the string
| # Or match regular expression number 2 below (the entire group fails if this one fails to match)
\s # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
)
temp% # Match the characters “temp%” literally
(?= # Assert that the regex below can be matched, starting at this position (positive lookahead)
# Match either the regular expression below (attempting the next alternative only if this one fails)
\s # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
| # Or match regular expression number 2 below (the entire group fails if this one fails to match)
$ # Assert position at the end of the string (or before the line break at the end of the string, if any)
)
"
If the pattern can contain characters that are special to Regex, run it through Regex.Escape first.
You did that, but do not escape the string that you search through. It is not needed, and it changes the input: Regex.Escape also puts a backslash in front of every space.
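Putting both pieces of advice together, a possible version of the program looks like this. It is a sketch, not a definitive implementation: the class WholeWordSearch and the helper Count are hypothetical names, and "whole word" is assumed to mean "delimited by whitespace or by the start or end of the string".

```csharp
using System;
using System.Text.RegularExpressions;

static class WholeWordSearch
{
    // Hypothetical helper: counts occurrences of 'term' that are delimited
    // by whitespace or by the start/end of the input.
    public static int Count(string input, string term)
    {
        // Escape only the search term, never the input string.
        string pattern = @"(?<=^|\s)" + Regex.Escape(term) + @"(?=\s|$)";
        return Regex.Matches(input, pattern, RegexOptions.IgnoreCase).Count;
    }

    static void Main()
    {
        Console.WriteLine(Count("Hi temp% dkfsfdf hi", "temp%")); // 1
        Console.WriteLine(Count("Hi temp% dkfsfdf hi", "hi"));    // 2 (case-insensitive)
        Console.WriteLine(Count("Hi temp%x", "temp%"));           // 0
        Console.WriteLine(Count("call f(x) now", "f(x)"));        // 1
    }
}
```

Note that with this definition punctuation counts as part of a word, so "temp%," would not be matched by the term "temp%". If that is not what you want, widen the lookarounds accordingly.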