C# Regex to match words from list

Tags: c#, regex

I have the following problem with my regular expression: I would like it to match caterpillar in the string "This is a caterpillar s tooth", but it matches cat instead. How can I change it?

        List<string> women = new List<string>()
        {
            "cat","caterpillar","tooth"
        };

        Regex rgx = new Regex(string.Join("|",women.ToArray()));


        MatchCollection mCol = rgx.Matches("This is a caterpillar s tooth");
        foreach (Match m in mCol)
        {
            //Displays 'cat' and 'tooth' - instead of 'caterpillar' and 'tooth'
            Console.WriteLine(m);
        }
Grant Smith asked Dec 26 '10 19:12

1 Answer

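You need a regex of the form \b(abc|def)\b.
\b matches a word boundary, i.e. the position between a word character and a non-word character. Without it, the alternation tries the words in list order and stops at the first one that matches, so cat wins over caterpillar. With \b on both sides, cat can no longer match inside caterpillar, because the t is followed by another word character.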

Also, you need to call Regex.Escape on each word, so that any regex metacharacters in a word are matched literally.

For example:

new Regex(@"\b(" + string.Join("|", women.Select(Regex.Escape).ToArray()) + @")\b");
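
To see the whole thing run against the string from the question, here is a minimal sketch. The class name WordListRegex and the helpers Build and FindWords are made up for illustration and are not part of any library; the non-capturing group (?:...) is a small substitution for the plain parentheses above and does not change what is matched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordListRegex
{
    // Hypothetical helper: builds \b(?:w1|w2|...)\b from a list of literal words.
    public static Regex Build(IEnumerable<string> words)
    {
        // Escape each word so characters such as '.' or '+' are matched literally.
        string alternation = string.Join("|", words.Select(w => Regex.Escape(w)));
        return new Regex(@"\b(?:" + alternation + @")\b");
    }

    // Hypothetical helper: returns every whole-word match found in the input, in order.
    public static List<string> FindWords(IEnumerable<string> words, string input)
    {
        return Build(words).Matches(input)
                           .Cast<Match>()
                           .Select(m => m.Value)
                           .ToList();
    }

    public static void Main()
    {
        var words = new List<string> { "cat", "caterpillar", "tooth" };

        // Prints "caterpillar" and then "tooth", each on its own line.
        foreach (string w in FindWords(words, "This is a caterpillar s tooth"))
        {
            Console.WriteLine(w);
        }
    }
}
```

Note that \b only behaves as expected when every word in the list starts and ends with a word character (a letter, digit or underscore). A word like "c#" would need a different kind of delimiter check at its end.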
SLaks answered Oct 10 '22 08:10