
Searching for dictionary keys contained in a string array

I have a List of strings where each item is free text describing a skill, so it looks something like this:

List<string> list = new List<string> {"very good right now", "pretty good",
 "convinced me that is good", "pretty medium", "just medium" .....}

I want to keep a user score based on these free texts. For now, I use conditions:

double score = 0; // running total across all skill descriptions

foreach (var item in list)
{
    if (item.Contains("good"))
    {
        score += 2.5;
        Console.WriteLine("good skill, score+= 2.5, is now {0}", score);
    }
    else if (item.Contains("low"))
    {
        score += 1.0;
        Console.WriteLine("low skill, score+= 1.0, is now {0}", score);
    }
}

Suppose that in the future I want to use a dictionary for the score mapping, such as:

Dictionary<string, double> dic = new Dictionary<string, double>
{ { "good", 2.5 }, { "low", 1.0 }};

What would be a good way to match the dictionary keys against the string list? The way I see it now is to do a nested loop:

foreach (var item in list)
{
    foreach (var key in dic.Keys)
    {
        if (item.Contains(key))
            score += dic[key];
    }
}

But I'm sure there are better ways. Better being faster, or more pleasant to the eye (LINQ) at the very least.

Thanks.

yoad w asked May 30 '17 20:05



2 Answers

var scores = from item in list
             from word in item.Split()
             join kvp in dic on word equals kvp.Key
             select kvp.Value;

var totalScore = scores.Sum();

Note: your current solution checks whether an item in the list contains a dictionary key as a substring, so it returns true even when the key is only part of a longer word in the item. E.g. "follow the rabbit" contains "low". Splitting each item into words solves this issue.
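
To make the difference concrete, here is a minimal sketch comparing the two checks. The class and method names (MatchDemo, ContainsSubstring, ContainsWord) are made up for illustration and are not part of the question's code.

```csharp
using System;
using System.Linq;

// Hypothetical helper, only for illustrating substring vs. whole-word matching.
public static class MatchDemo
{
    // Substring test, as in the question: true if key occurs anywhere in text.
    public static bool ContainsSubstring(string text, string key)
    {
        return text.Contains(key);
    }

    // Whole-word test: split on whitespace and compare each word exactly.
    public static bool ContainsWord(string text, string key)
    {
        return text.Split().Contains(key);
    }
}
```

With these, ContainsSubstring("follow the rabbit", "low") is true while ContainsWord("follow the rabbit", "low") is false, and ContainsWord("pretty low", "low") is still true. Keep in mind that Split() with no arguments only splits on whitespace, so a word followed by punctuation (such as "good,") would not match unless you also strip or split on punctuation.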

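Also, LINQ join internally builds a hash-based lookup over the second sequence (here the dictionary entries, keyed by kvp.Key) and probes it for each item of the first sequence. That gives you O(1) lookup per word instead of the O(N) you pay when enumerating all entries of the dictionary.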
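
If you prefer plain loops, the same hashed-lookup idea can be written directly against the dictionary with TryGetValue. This is only a sketch of the equivalent logic, not what join does internally; the WordScorer class and TotalScore method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: sums dictionary scores for every matching word.
public static class WordScorer
{
    public static double TotalScore(
        IEnumerable<string> items,
        IReadOnlyDictionary<string, double> scores)
    {
        double total = 0;
        foreach (var item in items)
        {
            // Split on whitespace so only whole words are matched.
            foreach (var word in item.Split())
            {
                // Hashed lookup: no scan over all dictionary entries.
                if (scores.TryGetValue(word, out var value))
                    total += value;
            }
        }
        return total;
    }
}
```

Called as WordScorer.TotalScore(list, dic), this adds a score once per matching word, so a word that appears twice in the same item is counted twice, just as in the join query.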

Sergey Berezovskiy answered Oct 27 '22 00:10



If your code finds N skill strings containing the word "good", it adds the score 2.5 N times.

So you can just count the skill strings containing each dictionary word and multiply that count by the corresponding score.

var scores = from pair in dic
             let word = pair.Key
             let score = pair.Value
             let count = list.Count(x => x.Contains(word))
             select score * count;

var totalScore = scores.Sum();

Mark Shevchenko answered Oct 26 '22 23:10
