
Partial ungrouping list of duplicate values

Tags: c#, linq

I know how to group data using LINQ, and I know how to split it into separate items, but I have no idea how to only partially ungroup it.

I have a set of data which looks something like this:

var data = new Dictionary<Header, Detail>()
{
    { new Header(), new Detail { Parts = new List<string> { "Part1", "Part1", "Part2" } } }
};

In order to process this correctly, I need every instance of a duplicate part to be a separate entry in the dictionary (although it doesn't matter if it remains a dictionary - IEnumerable<KeyValuePair<Header, Detail>> is perfectly acceptable). However, I don't want to split up the Parts list entirely - having different parts in the list is fine.

Specifically, I want the end data to look like this:

{
  { new Header(), new Detail { Parts = new List<string> { "Part1", "Part2" } } },
  { new Header(), new Detail { Parts = new List<string> { "Part1" } } },
}

For a more complex example:

var data = new Dictionary<Header, Detail>()
{
    { new Header(1), new Detail { Parts = new List<string> { "Part1", "Part1", "Part2" } } },

    { new Header(2), new Detail { Parts = new List<string> { "Part1", "Part2" } } },

    { new Header(3), new Detail { Parts = new List<string> { "Part1", "Part2", "Part2", "Part2", "Part3", "Part3"} } }
};

var desiredOutput = new List<KeyValuePair<Header, Detail>>()
{
    { new Header(1), new Detail { Parts = new List<string> { "Part1", "Part2" } } },
    { new Header(1), new Detail { Parts = new List<string> { "Part1" } } },

    { new Header(2), new Detail { Parts = new List<string> { "Part1", "Part2" } } },

    { new Header(3), new Detail { Parts = new List<string> { "Part1", "Part2", "Part3" } } },
    { new Header(3), new Detail { Parts = new List<string> { "Part2", "Part3" } } },
    { new Header(3), new Detail { Parts = new List<string> { "Part2" } } }
};

Any advice?

Bobson asked Nov 07 '12 14:11



2 Answers

LINQ will not help you much here, but here is an extension method that will do the trick:

public static IEnumerable<KeyValuePair<Header, Detail>> UngroupParts(
    this IEnumerable<KeyValuePair<Header, Detail>> data)
{
    foreach (var kvp in data)
    {
        Header header = kvp.Key;
        List<string> parts = kvp.Value.Parts.ToList();
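        do
        {
            // Each pass emits one entry holding a single copy of every part still remaining.
            List<string> distinctParts = parts.Distinct().ToList();
            Detail detail = new Detail() { Parts = distinctParts };
            yield return new KeyValuePair<Header, Detail>(header, detail);

            // Remove() drops only the first occurrence, so extra copies stay for the next pass.
            foreach (var part in distinctParts)
                parts.Remove(part);
        }
        while (parts.Any());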
    }
}

Usage:

var desiredOutput = data.UngroupParts();
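
To try the method above against the question's second example, here is a minimal self-contained sketch. The `Header` and `Detail` classes are assumptions (the question never shows them), and `UngroupDemo`, `Describe` and `Run` are hypothetical names added for illustration. The input is a list of pairs rather than a dictionary so that the enumeration order is explicit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the question's types; their real shape is not shown.
public class Header
{
    public int Value { get; }
    public Header(int value) { Value = value; }
}

public class Detail
{
    public List<string> Parts { get; set; } = new List<string>();
}

public static class UngroupDemo
{
    // Same loop as in the answer: peel off one copy of each distinct part per pass.
    public static IEnumerable<KeyValuePair<Header, Detail>> UngroupParts(
        this IEnumerable<KeyValuePair<Header, Detail>> data)
    {
        foreach (var kvp in data)
        {
            List<string> parts = kvp.Value.Parts.ToList();
            do
            {
                List<string> distinctParts = parts.Distinct().ToList();
                yield return new KeyValuePair<Header, Detail>(
                    kvp.Key, new Detail { Parts = distinctParts });

                // Remove() deletes only the first occurrence of each part.
                foreach (var part in distinctParts)
                    parts.Remove(part);
            }
            while (parts.Any());
        }
    }

    // Formats each entry as "header: part, part" so results are easy to compare.
    public static List<string> Describe(IEnumerable<KeyValuePair<Header, Detail>> entries)
    {
        return entries
            .Select(e => e.Key.Value + ": " + string.Join(", ", e.Value.Parts))
            .ToList();
    }

    // Builds the question's second data set and returns the formatted result.
    public static List<string> Run()
    {
        var data = new List<KeyValuePair<Header, Detail>>
        {
            new KeyValuePair<Header, Detail>(new Header(1),
                new Detail { Parts = new List<string> { "Part1", "Part1", "Part2" } }),
            new KeyValuePair<Header, Detail>(new Header(2),
                new Detail { Parts = new List<string> { "Part1", "Part2" } }),
            new KeyValuePair<Header, Detail>(new Header(3),
                new Detail { Parts = new List<string> { "Part1", "Part2", "Part2", "Part2", "Part3", "Part3" } })
        };
        return Describe(data.UngroupParts());
    }

    public static void Main()
    {
        foreach (var line in Run())
            Console.WriteLine(line);
    }
}
```

With these assumed types the program prints six lines: `1: Part1, Part2`, `1: Part1`, `2: Part1, Part2`, `3: Part1, Part2, Part3`, `3: Part2, Part3` and `3: Part2`. Note that an entry whose Parts list is empty still produces one (empty) entry, because the loop body runs before the condition is checked.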
Sergey Berezovskiy answered Sep 29 '22 04:09



No, there isn't really an existing LINQ function that does all of this.

Essentially, imagine grouping Parts by string and treating each group as a row; what you want is each "column". I did this with a helper extension method, GetNthValues, written in the style of the LINQ operators. With that in place, it's just a matter of grouping each entry's parts, calling the helper for successive values of n until it returns nothing, and putting the results back into a dictionary.

public static Dictionary<Header, Detail> Ungroup(Dictionary<Header, Detail> input)
{
    var output = new Dictionary<Header, Detail>();

    foreach (var key in input.Keys)
    {
        var lookup = input[key].Parts.ToLookup(part => part);

        bool done = false;

        for (int i = 0; !done; i++)
        {
            var parts = lookup.GetNthValues(i).ToList();
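            if (parts.Any())
            {
                // A fresh Header per entry keeps the dictionary keys distinct; this assumes
                // Header exposes Value and does not override Equals/GetHashCode.
                output.Add(new Header(key.Value), new Detail { Parts = parts });
            }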
            else
            {
                done = true;
            }
        }
    }

    return output;
}

public static IEnumerable<TElement> GetNthValues<TKey, TElement>(
    this ILookup<TKey, TElement> source, int n)
{
    foreach (var group in source)
    {
        if (group.Count() > n)
        {
            yield return group.ElementAt(n);
        }
    }
}
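
The row/column idea can also be written by composing standard operators, in case a separate helper is not wanted. This is only a sketch under assumptions: `ColumnSplit` and `SplitIntoColumns` are hypothetical names, and it works on a plain sequence of strings rather than on the Header/Detail types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnSplit
{
    // Splits a sequence into "columns": the n-th list holds the n-th copy of
    // every value that occurs more than n times.
    public static List<List<string>> SplitIntoColumns(IEnumerable<string> parts)
    {
        return parts
            .GroupBy(part => part)                               // one "row" per distinct value
            .SelectMany(row => row.Select((part, index) => new { part, index }))
            .GroupBy(cell => cell.index, cell => cell.part)      // regroup cells by column index
            .OrderBy(column => column.Key)                       // column 0 first
            .Select(column => column.ToList())
            .ToList();
    }

    public static void Main()
    {
        var parts = new[] { "Part1", "Part2", "Part2", "Part2", "Part3", "Part3" };
        foreach (var column in SplitIntoColumns(parts))
            Console.WriteLine(string.Join(", ", column));
    }
}
```

For the Header(3) parts from the question this prints `Part1, Part2, Part3`, then `Part2, Part3`, then `Part2`. Each column would then become the Parts list of one output entry. An empty input yields no columns at all.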
Servy answered Sep 29 '22 02:09
