Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to get distinct instance from a list by Lambda or LINQ

I have a class like this:

class MyClass<T> {
    public string value1 { get; set; }
    public T objT { get; set; }
}

and a list of this class. I would like to use .net 3.5 lambda or linq to get a list of MyClass by distinct value1. I guess this is possible and much simpler than the way in .net 2.0 to cache a list like this:

List<MyClass<T>> list; 
...
List<MyClass<T>> listDistinct = new List<MyClass<T>>();
foreach (MyClass<T> instance in list)
{
    // some code to check if listDistinct does contain obj with intance.Value1
    // then listDistinct.Add(instance);
}

What is the lambda or LINQ way to do it?

like image 211
David.Chu.ca Avatar asked Jul 26 '09 00:07

David.Chu.ca


3 Answers

Both Marc's and dahlbyk's answers seem to work very well. I have a much simpler solution though. Instead of using Distinct, you can use GroupBy. It goes like this:

var listDistinct
    = list.GroupBy(
        i => i.value1,
        (key, group) => group.First()
    ).ToArray();

Notice that I've passed two functions to the GroupBy(). The first is a key selector. The second gets only one item from each group. From your question, I assumed First() was the right one. You can write a different one, if you want to. You can try Last() to see what I mean.

I ran a test with the following input:

var list = new [] {
    new { value1 = "ABC", objT = 0 },
    new { value1 = "ABC", objT = 1 },
    new { value1 = "123", objT = 2 },
    new { value1 = "123", objT = 3 },
    new { value1 = "FOO", objT = 4 },
    new { value1 = "BAR", objT = 5 },
    new { value1 = "BAR", objT = 6 },
    new { value1 = "BAR", objT = 7 },
    new { value1 = "UGH", objT = 8 },
};

The result was:

//{ value1 = ABC, objT = 0 }
//{ value1 = 123, objT = 2 }
//{ value1 = FOO, objT = 4 }
//{ value1 = BAR, objT = 5 }
//{ value1 = UGH, objT = 8 }

I haven't tested it for performance. I believe that this solution is probably a little bit slower than one that uses Distinct. Despite this disadvantage, there are two great advantages: simplicity and flexibility. Usually, it's better to favor simplicity over optimization, but it really depends on the problem you're trying to solve.

like image 101
jpbochi Avatar answered Nov 20 '22 06:11

jpbochi


Hmm... I'd probably write a custom IEqualityComparer<T> so that I can use:

var listDistinct = list.Distinct(comparer).ToList();

and write the comparer via LINQ....

Possibly a bit overkill, but reusable, at least:

Usage first:

static class Program {
    static void Main() {
        var data = new[] {
            new { Foo = 1,Bar = "a"}, new { Foo = 2,Bar = "b"}, new {Foo = 1, Bar = "c"}
        };
        foreach (var item in data.DistinctBy(x => x.Foo))
            Console.WriteLine(item.Bar);
        }
    }
}

With utility methods:

public static class ProjectionComparer
{
    public static IEnumerable<TSource> DistinctBy<TSource,TValue>(
        this IEnumerable<TSource> source,
        Func<TSource, TValue> selector)
    {
        var comparer = ProjectionComparer<TSource>.CompareBy<TValue>(
            selector, EqualityComparer<TValue>.Default);
        return new HashSet<TSource>(source, comparer);
    }
}
public static class ProjectionComparer<TSource>
{
    public static IEqualityComparer<TSource> CompareBy<TValue>(
        Func<TSource, TValue> selector)
    {
        return CompareBy<TValue>(selector, EqualityComparer<TValue>.Default);
    }
    public static IEqualityComparer<TSource> CompareBy<TValue>(
        Func<TSource, TValue> selector,
        IEqualityComparer<TValue> comparer)
    {
        return new ComparerImpl<TValue>(selector, comparer);
    }
    sealed class ComparerImpl<TValue> : IEqualityComparer<TSource>
    {
        private readonly Func<TSource, TValue> selector;
        private readonly IEqualityComparer<TValue> comparer;
        public ComparerImpl(
            Func<TSource, TValue> selector,
            IEqualityComparer<TValue> comparer)
        {
            if (selector == null) throw new ArgumentNullException("selector");
            if (comparer == null) throw new ArgumentNullException("comparer");
            this.selector = selector;
            this.comparer = comparer;
        }

        bool IEqualityComparer<TSource>.Equals(TSource x, TSource y)
        {
            if (x == null && y == null) return true;
            if (x == null || y == null) return false;
            return comparer.Equals(selector(x), selector(y));
        }

        int IEqualityComparer<TSource>.GetHashCode(TSource obj)
        {
            return obj == null ? 0 : comparer.GetHashCode(selector(obj));
        }
    }
}
like image 22
Marc Gravell Avatar answered Nov 20 '22 06:11

Marc Gravell


You can use this extension method:

    IEnumerable<MyClass> distinctList = sourceList.DistinctBy(x => x.value1);

    public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector)
    {
        var knownKeys = new HashSet<TKey>();
        return source.Where(element => knownKeys.Add(keySelector(element)));
    }
like image 3
Jon Rea Avatar answered Nov 20 '22 05:11

Jon Rea