I have been searcing for LINQ
equivalent of WITH TIES
in sql server lately, I came across a couple things, which couldn't proove to be useful.
I know this question was asked before and has an accepted answer, but it doesn't work the way with ties does. The solution using GroupBy()
doesn't result as expected for TOP(3) WITH TIES
considering a data set consisting of {3 2 2 1 1 0}
the result set will be {3 2 2 1 1}
where it should be {3 2 2}
Using the following sample data (taken from this question):
CREATE TABLE Person
(
Id int primary key,
Name nvarchar(50),
Score float
)
INSERT INTO Person VALUES (1, 'Tom',8.9)
INSERT INTO Person VALUES (2, 'Jerry',8.9)
INSERT INTO Person VALUES (3, 'Sharti',7)
INSERT INTO Person VALUES (4, 'Mamuzi',9)
INSERT INTO Person VALUES (5, 'Kamala',9)
Traditional OrderByDescending(p => p.Score).Take(3)
will result with: Mamuzi, Kamala and one of Tom (or Jerry) where it should include BOTH
I know there is no built-in equivalent of it and i've found a way to implement it. I don't know if it is the best way to do it and open for alternative solutions.
var query = (from q in list.OrderByDescending(s => s.Score).Take(3).Select(s => s.Score).Distinct()
from i in list
where q == i.Score
select i).ToList();
Edit:
@Zefnus
I wasn't sure in which order you wanted it but to change the order you can put a OrderBy(s => s.Score) between select i and ToList()
I don't have the possibility to check what sql statement my linq clause would produce. But your answer is much better i think. And your question was also really good. I never thought about top with ties in linq. ;)
Basically it only takes top 3 scores from the first list and compares them with the whole list and i takes only those scores which are equal to the scores of the first list.
IEnumerable<T>
with anything touching a database!A solution aimed at LinqToSql and LinqToEntities should not be using IEnumerable<T>
. Your current self answer will result in every single person being selected from the database and then being queried in memory using LinqToObjects.
To make a solution that is translated to SQL and executed by the database you have to use IQueryable<T>
and Expressions instead.
public static class QueryableExtensions
{
public static IQueryable<T> TopWithTies<T, TComparand>(this IQueryable<T> source, Expression<Func<T, TComparand>> topBy, int topCount)
{
if (source == null) throw new ArgumentNullException("source");
if (topBy == null) throw new ArgumentNullException("topBy");
if (topCount < 1) throw new ArgumentOutOfRangeException("topCount", string.Format("topCount must be greater than 0, was {0}", topCount));
var topValues = source.OrderBy(topBy)
.Select(topBy)
.Take(topCount);
var queryableMaxMethod = typeof(Queryable).GetMethods()
.Single(mi => mi.Name == "Max" &&
mi.GetParameters().Length == 1 &&
mi.IsGenericMethod)
.MakeGenericMethod(typeof(TComparand));
var lessThanOrEqualToMaxTopValue = Expression.Lambda<Func<T, bool>>(
Expression.LessThanOrEqual(
topBy.Body,
Expression.Call(
queryableMaxMethod,
topValues.Expression)),
new[] { topBy.Parameters.Single() });
var topNRowsWithTies = source.Where(lessThanOrEqualToMaxTopValue)
.OrderBy(topBy);
return topNRowsWithTies;
}
public static IQueryable<T> TopWithTiesDescending<T, TComparand>(this IQueryable<T> source, Expression<Func<T, TComparand>> topBy, int topCount)
{
if (source == null) throw new ArgumentNullException("source");
if (topBy == null) throw new ArgumentNullException("topBy");
if (topCount < 1) throw new ArgumentOutOfRangeException("topCount", string.Format("topCount must be greater than 0, was {0}", topCount));
var topValues = source.OrderByDescending(topBy)
.Select(topBy)
.Take(topCount);
var queryableMinMethod = typeof(Queryable).GetMethods()
.Single(mi => mi.Name == "Min" &&
mi.GetParameters().Length == 1 &&
mi.IsGenericMethod)
.MakeGenericMethod(typeof(TComparand));
var greaterThanOrEqualToMinTopValue = Expression.Lambda<Func<T, bool>>(
Expression.GreaterThanOrEqual(
topBy.Body,
Expression.Call(queryableMinMethod,
topValues.Expression)),
new[] { topBy.Parameters.Single() });
var topNRowsWithTies = source.Where(greaterThanOrEqualToMinTopValue)
.OrderByDescending(topBy);
return topNRowsWithTies;
}
}
This creates queries of the following form:
SELECT [t0].[Id], [t0].[Name], [t0].[Score]
FROM [Person] AS [t0]
WHERE [t0].[Score] >= ((
SELECT MIN([t2].[Score])
FROM (
SELECT TOP (3) [t1].[Score]
FROM [Person] AS [t1]
ORDER BY [t1].[Score] DESC
) AS [t2]
))
ORDER BY [t0].[Score] DESC
That query is only about 50% worse than the baseline query:
SELECT TOP (3) WITH TIES
[t0].[Id],
[t0].[Name],
[t0].[Score]
FROM
[Person] AS [t0]
ORDER BY [t0].[Score] desc
With a data set consisting of your original 5 records and an additional 10000 records all with scores less than the original both of these are more or less instant (less than 20 milliseconds).
The IEnumerable<T>
approach took a whole 2 minutes!
If the expression building and reflection seems scary the same thing can be achieved with a join:
public static IQueryable<T> TopWithTiesDescendingJoin<T, TComparand>(this IQueryable<T> source, Expression<Func<T, TComparand>> topBy, int topCount)
{
if (source == null) throw new ArgumentNullException("source");
if (topBy == null) throw new ArgumentNullException("topBy");
if (topCount < 1) throw new ArgumentOutOfRangeException("topCount", string.Format("topCount must be greater than 0, was {0}", topCount));
var orderedByValue = source.OrderByDescending(topBy);
var topNValues = orderedByValue.Select(topBy).Take(topCount).Distinct();
var topNRowsWithTies = topNValues.Join(source, value => value, topBy, (x, row) => row);
return topNRowsWithTies.OrderByDescending(topBy);
}
With the following query as the result (with about the same performance):
SELECT [t3].[Id], [t3].[Name], [t3].[Score]
FROM (
SELECT DISTINCT [t1].[Score]
FROM (
SELECT TOP (3) [t0].[Score]
FROM [Person] AS [t0]
ORDER BY [t0].[Score] DESC
) AS [t1]
) AS [t2]
INNER JOIN [Person] AS [t3] ON [t2].[Score] = [t3].[Score]
ORDER BY [t3].[Score] DESC
Another solution - which probably is not as efficient as the other solution - is to get TOP(3)
Scores and get the rows with Score values contained in the TOP(3)
.
We can use Contains()
as follows;
orderedPerson = datamodel.People.OrderByDescending(p => p.Score);
topPeopleList =
(
from p in orderedPerson
let topNPersonScores = orderedPerson.Take(n).Select(p => p.Score).Distinct()
where topNPersonScores.Contains(p.Score)
select p
).ToList();
What's good about this implementation is that it's extension method TopWithTies()
can be implemented easly as;
public static IEnumerable<T> TopWithTies<T, TResult>(this IEnumerable<T> enumerable, Func<T, TResult> selector, int n)
{
IEnumerable<T> orderedEnumerable = enumerable.OrderByDescending(selector);
return
(
from p in orderedEnumerable
let topNValues = orderedEnumerable.Take(n).Select(selector).Distinct()
where topNValues.Contains(selector(p))
select p
);
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With