
Grouping a list into sublists by key/attribute without changing the list order (just chop into lists by attrib)

Tags: c#, list, linq

I've hunted around for this, and I am sure that I am just missing it because I am not that good with LINQ.

I have a list that looks like:

type=a, value=aaaa
type=a, value=bbbb
type=b, value=cccc
type=d, value=dddd
type=d, value=eeee
type=d, value=ffff
type=a, value=gggg
type=b, value=hhhh
type=b, value=iiii
type=b, value=jjjj

I would like to break this into sublists, without sorting (I need the original order maintained). I would like to get back these lists, in this order, as a list of lists or something similar:

List 1
type=a, value=aaaa
type=a, value=bbbb

List 2
type=b, value=cccc

List 3
type=d, value=dddd
type=d, value=eeee
type=d, value=ffff

List 4
type=a, value=gggg

List 5
type=b, value=hhhh
type=b, value=iiii
type=b, value=jjjj

I would imagine that looping is not the best answer.

Any ideas much appreciated.

Long live stackoverflow.com!

Chris

Edit After Four Answers:

I checked the answers from:

  • Enigmativity
  • Bert Evans
  • Risky Martin
  • david.s

They all work nicely. Bert Evans brought up performance, which isn't a big concern for me in this case, but I did some quick checking for the sake of the post.

I didn't modify anyone's code, just timed it doing 4,000 of these operations on rather short lists.

Risky's answer was the fastest. Bert's answer was only a tad slower.

david.s's and Enigmativity's answers were hardly any slower, really.
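For anyone who wants to repeat a comparison like this, here is a minimal sketch of a timing harness using Stopwatch. It is not the exact code used for the numbers above; TimingSketch, Time, and groupingUnderTest are hypothetical names, and the delegate stands in for whichever answer's method is being measured.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class TimingSketch
{
    // Runs the grouping function `iterations` times over the same input and
    // returns the total elapsed time. `groupingUnderTest` is a placeholder
    // (an assumption of this sketch) for the method being measured.
    public static TimeSpan Time<T>(
        IList<T> input,
        Func<IEnumerable<T>, IEnumerable<IEnumerable<T>>> groupingUnderTest,
        int iterations = 4000)
    {
        var stopwatch = Stopwatch.StartNew();
        for (var n = 0; n < iterations; n++)
        {
            // Enumerate every group and every item; a lazily evaluated
            // result would otherwise do no work and look unrealistically fast.
            foreach (var group in groupingUnderTest(input))
            {
                foreach (var item in group)
                {
                }
            }
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Forcing full enumeration inside the loop matters here because some of the answers return lazily evaluated sequences.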

I marked Risky's answer as accepted because of performance and for pointing to the related post early on, and then coming back to provide an answer.

I would agree that Bert's is the most readable, though.

I honestly don't know which one I will use... actually, I will use Enigmativity's solution because it has already taken into account that I only need the values and one key per subgroup.

chrismead asked Dec 20 '22 18:12


2 Answers

Using a slightly modified extension method from this answer and this answer:

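using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    // Groups adjacent elements for which predicate(current, next) is true.
    // The group key is a running index (0, 1, 2, ...), not the element's own key.
    public static IEnumerable<IGrouping<int, T>> GroupConsecutive<T>(this IEnumerable<T> set, Func<T, T, bool> predicate)
    {
        var i = 0;
        var k = 0;
        var ranges = from e in set
                     let idx = ++i
                     // Look ahead one element: O(1) for an IList<T>, O(n) otherwise.
                     let next = set.ElementAtOrDefault(idx)
                     // Stay in group k while the next element matches; otherwise close the group.
                     let key = next == null ? k : predicate(e, next) ? k : k++
                     group e by key into g
                     select g;
        return ranges;
    }
}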

And given a class:

public class Foo
{
    public string Type { get; set; }
    public string Value { get; set; }
}

You can do this:

List<Foo> list = new List<Foo>()
{
    new Foo() { Type = "a", Value = "aaaa" },
    new Foo() { Type = "a", Value = "bbbb" },
    new Foo() { Type = "b", Value = "cccc" },
    new Foo() { Type = "d", Value = "dddd" },
    new Foo() { Type = "d", Value = "eeee" },
    new Foo() { Type = "d", Value = "ffff" },
    new Foo() { Type = "a", Value = "gggg" },
    new Foo() { Type = "b", Value = "hhhh" },
    new Foo() { Type = "b", Value = "iiii" },
    new Foo() { Type = "b", Value = "jjjj" }
};

var groups = list.GroupConsecutive((a, b) => a.Type == b.Type);

foreach (var group in groups)
{
    Console.WriteLine("List " + group.Key);
    foreach (var item in group)
    {
        Console.WriteLine("Type=" + item.Type + "    Value=" + item.Value);
    }
    Console.WriteLine();
}

And the result would look like this:

List 0
Type=a    Value=aaaa
Type=a    Value=bbbb

List 1
Type=b    Value=cccc

List 2
Type=d    Value=dddd
Type=d    Value=eeee
Type=d    Value=ffff

List 3
Type=a    Value=gggg

List 4
Type=b    Value=hhhh
Type=b    Value=iiii
Type=b    Value=jjjj

david.s answered Dec 30 '22 11:12


This could be generalized or made into an extension method, but you get the idea:

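// Must be declared inside a class; as written it is a plain static method, not an extension method.
public static IEnumerable<List<Item>> GroupConsecutive(IEnumerable<Item> items)
{
    if (items.Any())
    {
        // Take the leading run of items that share the first item's type.
        string firstType = items.Select(i => i.Type).First();
        var adjacents = items.TakeWhile(i => i.Type == firstType).ToList();
        yield return adjacents;
        // Recurse on the remainder and yield each later run in order.
        foreach (var group in GroupConsecutive(items.Skip(adjacents.Count)))
        {
            yield return group;
        }
    }
}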

Using this class:

public class Item
{
    public string Type { get; set; }
    public string Value { get; set; }
}

Edit: Here are the tradeoffs for this solution:

Pros:

  • Returns a lazily evaluated collection
  • Doesn't mutate any variables
  • Is concise

Cons:

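  • Iterates through items more than once. Each call starts enumerating items several times (Any, First, TakeWhile), and every recursive call skips forward again from the beginning of the source. This isn't a big deal if items is a short List, but if items is an IEnumerable that performs an expensive computation for each item, this method could be slower than other methods.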

If you want items to be iterated once with lazy evaluation, I recommend the GroupAdjacent extension method as mentioned in this answer or looping with yield return. If you want one iteration without lazy evaluation, I recommend looping or the Aggregate method.
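As a rough illustration of those two alternatives, here is a minimal sketch of a single-pass lazy version (looping with yield return) and a single-pass eager version (Aggregate). The names AdjacentGrouping, ChunkByKeyLazy, and ChunkByKeyEager are made up for this example; this is not the GroupAdjacent method from the linked answer, just one way the idea could be written.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AdjacentGrouping
{
    // Single pass, lazy: each run is yielded as soon as the key changes.
    public static IEnumerable<List<T>> ChunkByKeyLazy<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var comparer = EqualityComparer<TKey>.Default;
        List<T> current = null;
        var currentKey = default(TKey);

        foreach (var item in source)
        {
            var key = keySelector(item);
            if (current != null && !comparer.Equals(key, currentKey))
            {
                yield return current;   // the previous run just ended
                current = null;
            }
            if (current == null)
            {
                current = new List<T>(); // start a new run
                currentKey = key;
            }
            current.Add(item);
        }

        if (current != null)
        {
            yield return current;       // the final run
        }
    }

    // Single pass, eager: Aggregate builds the complete list of runs up front.
    public static List<List<T>> ChunkByKeyEager<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var comparer = EqualityComparer<TKey>.Default;
        return source.Aggregate(new List<List<T>>(), (runs, item) =>
        {
            var last = runs.Count > 0 ? runs[runs.Count - 1] : null;
            if (last != null && comparer.Equals(keySelector(last[0]), keySelector(item)))
            {
                last.Add(item);                    // same key as the current run
            }
            else
            {
                runs.Add(new List<T> { item });    // key changed: open a new run
            }
            return runs;
        });
    }
}
```

With the sample data from the question, items.ChunkByKeyLazy(i => i.Type) would yield five lists of sizes 2, 1, 3, 1 and 3, in the original order. Both versions read the source exactly once.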

Risky Martin answered Dec 30 '22 10:12