I am learning about LINQ-to-SQL and everything was going well until something strange happened:
I tried to make an example of distinct
, so, using the Northwind dabatase I wrote the following query:
var query = from o in db.Orders orderby o.CustomerID select new { o.CustomerID };
If I print the SQL generated by LINQ-to-SQL for the query stored in query
it looks like this:
SELECT [t0].[CustomerID] FROM [dbo].[Orders] AS [t0] ORDER BY [t0].[CustomerID]
So, as usual, the query brings all the CustomerID
for each Order
in the Orders
table ordered alphabetically.
But! If I use the Distinct()
method like this:
var query = ( from o in db.Orders orderby o.CustomerID select new { o.CustomerID }).Distinct();
The query brings the expected results of the Distinct
clause, but the CustomerID
s are not ordered despite I wrote orderby o.CustomerID
!
The SQL query for this second LINQ query is the following:
SELECT DISTINCT [t0].[CustomerID] FROM [dbo].[Orders] AS [t0]
As we can see **the ORDER BY
clause is missing. Why is that?
Why does the ORDER BY
clause disappears when I use the Distinct()
method?
Without a transformation, a statement that contains both DISTINCT and ORDER BY would require two separate sorting steps-one to satisfy DISTINCT and one to satisfy ORDER BY. (Currently, Derby uses sorting to evaluate DISTINCT.
Yes, the application needs to compare every record to the "distinct" records cache as it goes. You can improve performance by using an index, particularly on the numeric and date fields. The operation you describe is O(n²).
No. There are a number of circumstances in which a DISTINCT in Oracle does not imply a sort, the most important of which is the hashing algorithm used in 10g+ for both group by and distinct operations. Always specify ORDER BY if you want an ordered result set, even in 9i and below.
Yes, you can use COUNT() and DISTINCT together to display the count of only distinct rows.
From the Queryable.Distinct documentation;
The expected behavior is that it returns an unordered sequence of the unique items in source.
In other words, any order the existing IQueryable has is lost when you use Distinct()
on it.
What you want is probably something more like this, an OrderBy() after the Distinct() is done;
var query = (from o in db.Orders select new { o.CustomerID }).Distinct().OrderBy(x => x.CustomerID);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With