The C# LINQ Distinct() method removes duplicate elements from a sequence (such as a list) and returns the distinct elements from a single data source. It belongs to the Set operators category of LINQ query operators and works much like the DISTINCT keyword in Structured Query Language (SQL).
LINQ Distinct is not that smart when it comes to custom objects. By default it compares class instances by reference, so two different objects count as distinct even when all of their member fields hold the same values. One workaround is to implement the IEquatable<T> interface on the type (and override GetHashCode to match).
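A minimal sketch of the IEquatable<T> workaround follows. The Person type, its properties, and the sample values are hypothetical and exist only for illustration; the point is that Equals and GetHashCode have to agree with each other for Distinct() to collapse duplicates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type for illustration; not part of the original text.
public sealed class Person : IEquatable<Person>
{
    public int Id { get; }
    public string Name { get; }

    public Person(int id, string name)
    {
        Id = id;
        Name = name;
    }

    // Two people are equal when both Id and Name match.
    public bool Equals(Person other)
    {
        if (ReferenceEquals(other, null)) return false;
        return Id == other.Id && Name == other.Name;
    }

    // Keep the object-level override consistent with IEquatable<Person>.
    public override bool Equals(object obj)
    {
        return Equals(obj as Person);
    }

    // Equal objects must return equal hash codes, or Distinct() will miss duplicates.
    public override int GetHashCode()
    {
        unchecked
        {
            return (Id * 397) ^ (Name == null ? 0 : Name.GetHashCode());
        }
    }
}

public static class IEquatableDemo
{
    // Uses the default comparer, which picks up IEquatable<Person>.
    public static List<Person> DistinctPeople(IEnumerable<Person> people)
    {
        return people.Distinct().ToList();
    }

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person(1, "Ann"),
            new Person(1, "Ann"), // same values, different instance
            new Person(2, "Bob"),
        };

        foreach (var p in DistinctPeople(people))
        {
            Console.WriteLine($"{p.Id} {p.Name}");
        }
    }
}
```

With the two overrides removed, the same call would return all three objects, because each instance would only be equal to itself.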
How do I use distinct in LINQ to get results based on one field of the table (so I do not need the whole record to be duplicated)? I know how to write a basic query using distinct, as follows: var query = (from r in table1 orderby r.Text select r).
Are you trying to be distinct by more than one field? If so, just use an anonymous type and the Distinct operator and it should be okay:
var query = doc.Elements("whatever")
               .Select(element => new {
                   id = (int) element.Attribute("id"),
                   category = (int) element.Attribute("cat") })
               .Distinct();
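The query above works because C# anonymous types get compiler-generated Equals and GetHashCode methods that compare property values. Here is a small self-contained sketch of the same pattern; the XML content, the "item" element name, and the helper name CountDistinctPairs are assumptions made up for the example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class AnonymousDistinctDemo
{
    // Counts the distinct (id, cat) attribute pairs among the <item> children.
    public static int CountDistinctPairs(XElement doc)
    {
        return doc.Elements("item")
                  .Select(e => new { id = (int)e.Attribute("id"), category = (int)e.Attribute("cat") })
                  .Distinct()
                  .Count();
    }

    public static void Main()
    {
        // Hypothetical sample data: the first two items share both attributes.
        var doc = XElement.Parse(
            "<root>" +
            "<item id='1' cat='10'/>" +
            "<item id='1' cat='10'/>" +
            "<item id='1' cat='20'/>" +
            "</root>");

        Console.WriteLine(CountDistinctPairs(doc)); // prints 2
    }
}
```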
If you're trying to get a distinct set of values of a "larger" type, but only looking at some subset of properties for the distinctness aspect, you probably want DistinctBy as implemented in MoreLINQ in DistinctBy.cs:
public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    IEqualityComparer<TKey> comparer)
{
    HashSet<TKey> knownKeys = new HashSet<TKey>(comparer);
    foreach (TSource element in source)
    {
        if (knownKeys.Add(keySelector(element)))
        {
            yield return element;
        }
    }
}
(If you pass in null as the comparer, it will use the default comparer for the key type.)
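If the project targets .NET 6 or later (an assumption; the original answer predates it), the base class library ships its own Enumerable.DistinctBy, so no extra package or hand-written extension is needed. The sketch below uses a hypothetical Order record and made-up data to show a call keyed on a single property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type for illustration; not from the original text.
public sealed record Order(int Id, string Customer);

public static class DistinctByDemo
{
    // Keeps one order per customer, keyed on the Customer property only.
    public static List<Order> OnePerCustomer(IEnumerable<Order> orders)
    {
        return orders.DistinctBy(o => o.Customer).ToList();
    }

    public static void Main()
    {
        var orders = new[]
        {
            new Order(1, "alice"),
            new Order(2, "bob"),
            new Order(3, "alice"), // same customer as order 1
        };

        foreach (var order in OnePerCustomer(orders))
        {
            Console.WriteLine($"{order.Id} {order.Customer}");
        }
    }
}
```

Note that referencing MoreLINQ alongside the built-in method can make the call ambiguous, so a project would normally pick one of the two.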
Just use the Distinct() with your own comparer: http://msdn.microsoft.com/en-us/library/bb338049.aspx
In addition to Jon Skeet's answer, you can also use a group by expression to get the unique groups along with a count of the elements in each group:
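// XElement has no Key property, so the id is read from the "id" attribute instead.
var query = from e in doc.Elements("whatever")
            group e by new { id = (int) e.Attribute("id"), val = e.Value } into g
            select new { id = g.Key.id, val = g.Key.val, count = g.Count() };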
For anyone still looking, here's another way of implementing a custom lambda comparer.
public class LambdaComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _expression;

    public LambdaComparer(Func<T, T, bool> lambda)
    {
        _expression = lambda;
    }

    public bool Equals(T x, T y)
    {
        return _expression(x, y);
    }

    public int GetHashCode(T obj)
    {
        /*
         Distinct() compares hash codes first and only calls Equals when they match.
         Returning the same hash (0) for every object means every comparison falls
         through to the Equals check, which is what we are going for here.
         The trade-off is that all elements share one hash bucket, so this gets
         slow on large sequences.
        */
        return 0;
    }
}
You can then create an extension method for the LINQ Distinct that takes a lambda (like any extension method, it has to be declared inside a static class):
public static IEnumerable<T> Distinct<T>(this IEnumerable<T> list, Func<T, T, bool> lambda)
{
    return list.Distinct(new LambdaComparer<T>(lambda));
}
Usage:
var availableItems = list.Distinct((p, p1) => p.Id == p1.Id);
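A possible variation, if the constant hash code becomes a bottleneck, is to build the comparer from a key selector instead of an equality lambda, so that both Equals and GetHashCode are derived from the same key. This is only a sketch: KeyComparer and the demo names are hypothetical, not part of LINQ or of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: derives both Equals and GetHashCode from a key selector.
public sealed class KeyComparer<T, TKey> : IEqualityComparer<T>
{
    private readonly Func<T, TKey> _keySelector;
    private readonly IEqualityComparer<TKey> _keyComparer = EqualityComparer<TKey>.Default;

    public KeyComparer(Func<T, TKey> keySelector)
    {
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));
        _keySelector = keySelector;
    }

    public bool Equals(T x, T y)
    {
        // Two nulls are equal; a null never equals a non-null value.
        if (x == null || y == null) return x == null && y == null;
        return _keyComparer.Equals(_keySelector(x), _keySelector(y));
    }

    public int GetHashCode(T obj)
    {
        if (obj == null) return 0;
        TKey key = _keySelector(obj);
        // Elements with equal keys get equal hashes, so the hash lookup still works.
        return key == null ? 0 : _keyComparer.GetHashCode(key);
    }
}

public static class KeyComparerDemo
{
    // Treats words of the same length as duplicates.
    public static int CountDistinctByLength(IEnumerable<string> words)
    {
        return words.Distinct(new KeyComparer<string, int>(w => w.Length)).Count();
    }

    public static void Main()
    {
        var words = new[] { "apple", "grape", "fig", "kiwi", "plum" };
        Console.WriteLine(CountDistinctByLength(words)); // prints 3
    }
}
```

The design choice here is to trade the flexibility of an arbitrary equality lambda for a hash code that spreads elements across buckets.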
I'm a bit late to the answer, but you may want to do this if you want the whole element, not only the values you want to group by:
var query = doc.Elements("whatever")
               .GroupBy(element => new {
                   id = (int) element.Attribute("id"),
                   category = (int) element.Attribute("cat") })
               .Select(e => e.First());
This will give you the first whole element matching your group by selection, much like Jon Skeet's second example using DistinctBy, but without supplying an IEqualityComparer. DistinctBy will most likely be faster, but the solution above involves less code if performance is not an issue.
// First get the DataTable as dt.
// DataRowComparer.Default compares rows by their column values rather than by reference.
IEnumerable<DataRow> distinctRows = dt.AsEnumerable().Distinct(DataRowComparer.Default);
foreach (DataRow row in distinctRows)
{
    Console.WriteLine("{0,-15} {1,-15}",
        row.Field<int>(0),
        row.Field<string>(1));
}
Since we are talking about having every element exactly once, a "set" makes more sense to me.
Example with classes and IEqualityComparer implemented:
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }

    public Product(int x, string y)
    {
        Id = x;
        Name = y;
    }
}
public class ProductCompare : IEqualityComparer<Product>
{
    public bool Equals(Product x, Product y)
    {
        // Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) return true;
        // Check whether any of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;
        // Check whether the products' properties are equal.
        return x.Id == y.Id && x.Name == y.Name;
    }

    public int GetHashCode(Product product)
    {
        // Check whether the object is null.
        if (Object.ReferenceEquals(product, null)) return 0;
        // Get the hash code for the Name property if it is not null.
        int hashProductName = product.Name == null ? 0 : product.Name.GetHashCode();
        // Get the hash code for the Id property.
        int hashProductCode = product.Id.GetHashCode();
        // Combine both into the hash code for the product.
        return hashProductName ^ hashProductCode;
    }
}
Now:
List<Product> originalList = new List<Product> {new Product(1, "ad"), new Product(1, "ad")};
var setList = new HashSet<Product>(originalList, new ProductCompare()).ToList();
setList will have unique elements. I thought of this while dealing with .Except(), which returns a set difference.