
How to find members that exist in at least two lists in a list of lists

Tags: c#, list

I have an array of lists:

var stringLists = new List<string>[] 
{ 
    new List<string>(){ "a", "b", "c" },
    new List<string>(){ "d", "b", "c" },
    new List<string>(){ "a", "d", "c" }
};

I want to extract all elements that appear in at least two of the lists. For this example, the result should be ["a", "b", "c", "d"]. I know how to find the elements common to all lists, but I couldn't think of a way to solve this problem.

AnishaJain asked Jul 13 '15 12:07


2 Answers

You could use something like this:

var result = stringLists.SelectMany(l => l.Distinct())
                        .GroupBy(e => e)
                        .Where(g => g.Count() >= 2)
                        .Select(g => g.Key);

Just for fun, some iterative solutions:

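var seen = new HashSet<string>();    // elements seen in any list so far
var current = new HashSet<string>(); // elements seen in the list being processed
var result = new HashSet<string>();
foreach (var list in stringLists)
{
    foreach (var element in list)
        // first occurrence in this list, but already seen in an earlier list
        if (current.Add(element) && !seen.Add(element))
            result.Add(element);

    current.Clear();
}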

or:

var already_seen = new Dictionary<string, bool>();
foreach (var list in stringLists)
    foreach (var element in list.Distinct())
        // false the first time an element shows up, true from its second list onwards
        already_seen[element] = already_seen.ContainsKey(element);

var result = already_seen.Where(kvp => kvp.Value).Select(kvp => kvp.Key);

or (inspired by Tim's answer):

int tmp;
var items = new Dictionary<string,int>();

foreach(var str in stringLists.SelectMany(l => l.Distinct()))
{
    items.TryGetValue(str, out tmp);
    items[str] = tmp + 1;
}

var result = items.Where(kv => kv.Value >= 2).Select(kv => kv.Key);
sloth answered Nov 18 '22 07:11


You could use a Dictionary<string, int>, where the key is the string and the value is the number of lists it occurs in:

Dictionary<string, int> itemCounts = new Dictionary<string,int>();
for(int i = 0; i < stringLists.Length; i++)
{
    List<string> list = stringLists[i];
    foreach(string str in list.Distinct())
    {
        if(itemCounts.ContainsKey(str))
            itemCounts[str] += 1;
        else
            itemCounts.Add(str, 1);
    }
}
var result = itemCounts.Where(kv => kv.Value >= 2).Select(kv => kv.Key);

I use list.Distinct() since you only want to count occurrences in different lists, not repeats within a single list.
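
To see why the Distinct() call matters, here is a minimal sketch with made-up data in which one list repeats an element. The `lists` array and the `CountOccurrences` helper are illustrative names, not part of the question. Without Distinct(), "x" reaches a count of 2 although it appears in only one list:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical input: "x" is repeated, but only inside a single list.
var lists = new List<string>[]
{
    new List<string> { "x", "x", "y" },
    new List<string> { "y", "z" }
};

// Illustrative helper: counts each element, optionally de-duplicating every list first.
Dictionary<string, int> CountOccurrences(bool distinctPerList)
{
    var counts = new Dictionary<string, int>();
    foreach (var list in lists)
    {
        IEnumerable<string> items = distinctPerList ? list.Distinct() : list;
        foreach (var s in items)
        {
            counts.TryGetValue(s, out int n); // n is 0 when the key is missing
            counts[s] = n + 1;
        }
    }
    return counts;
}

var withDistinct = CountOccurrences(true)
    .Where(kv => kv.Value >= 2).Select(kv => kv.Key)
    .OrderBy(k => k, StringComparer.Ordinal).ToList();
var withoutDistinct = CountOccurrences(false)
    .Where(kv => kv.Value >= 2).Select(kv => kv.Key)
    .OrderBy(k => k, StringComparer.Ordinal).ToList();

Console.WriteLine(string.Join(",", withDistinct));    // y
Console.WriteLine(string.Join(",", withoutDistinct)); // x,y
```

Only the de-duplicated count answers "in how many lists does this element occur"; the raw count answers "how many times does it occur overall".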

As requested, here is an extension method which you can reuse with any type:

public static IEnumerable<T> GetItemsWhichOccurAtLeastIn<T>(this IEnumerable<IEnumerable<T>> seq, int minCount, IEqualityComparer<T> comparer = null)
{
    if (comparer == null) comparer = EqualityComparer<T>.Default;
    Dictionary<T, int> itemCounts = new Dictionary<T, int>(comparer);

    foreach (IEnumerable<T> subSeq in seq)
    {
        foreach (T x in subSeq.Distinct(comparer))
        {
            if (itemCounts.ContainsKey(x))
                itemCounts[x] += 1;
            else
                itemCounts.Add(x, 1);
        }
    }
    foreach(var kv in itemCounts.Where(kv => kv.Value >= minCount))
        yield return kv.Key;
}

Usage is simple:

string result = String.Join(",", stringLists.GetItemsWhichOccurAtLeastIn(2)); // a,b,c,d
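
The optional comparer parameter changes what counts as "the same item". Here is a minimal sketch, assuming case-insensitive matching is wanted. The sample data is made up, and the counting logic is repeated as a local function (`AtLeastIn`, a hypothetical name) only so that the snippet runs as a single file of top-level statements:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical input: the same letter in different casing.
var mixedCase = new List<string>[]
{
    new List<string> { "A", "b" },
    new List<string> { "a", "c" }
};

// Same counting idea as the extension method, written as a local function.
IEnumerable<T> AtLeastIn<T>(IEnumerable<IEnumerable<T>> seq, int minCount, IEqualityComparer<T> comparer = null)
{
    if (comparer == null) comparer = EqualityComparer<T>.Default;
    var counts = new Dictionary<T, int>(comparer);
    foreach (var subSeq in seq)
    {
        foreach (var x in subSeq.Distinct(comparer))
        {
            counts.TryGetValue(x, out int n); // n is 0 when the key is missing
            counts[x] = n + 1;
        }
    }
    return counts.Where(kv => kv.Value >= minCount).Select(kv => kv.Key).ToList();
}

// Default comparer: "A" and "a" are different strings, so nothing occurs twice.
var caseSensitive = AtLeastIn<string>(mixedCase, 2).ToList();

// Case-insensitive comparer: "A" and "a" are counted as one item.
var ignoreCase = AtLeastIn<string>(mixedCase, 2, StringComparer.OrdinalIgnoreCase).ToList();

Console.WriteLine(caseSensitive.Count); // 0
Console.WriteLine(ignoreCase.Count);    // 1
```

Passing the same comparer to both the dictionary and Distinct() keeps the per-list de-duplication consistent with the counting.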
Tim Schmelter answered Nov 18 '22 05:11