Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

DISTINCT() and ORDERBY issue

I am learning about LINQ-to-SQL and everything was going well until something strange happened:

I tried to make an example of distinct, so, using the Northwind dabatase I wrote the following query:

var query =      from o in db.Orders     orderby o.CustomerID     select new     {         o.CustomerID     }; 

If I print the SQL generated by LINQ-to-SQL for the query stored in query it looks like this:

SELECT [t0].[CustomerID] FROM [dbo].[Orders] AS [t0] ORDER BY [t0].[CustomerID] 

So, as usual, the query brings all the CustomerID for each Order in the Orders table ordered alphabetically.

But! If I use the Distinct() method like this:

var query = (     from o in db.Orders     orderby o.CustomerID     select new     {         o.CustomerID     }).Distinct(); 

The query brings the expected results of the Distinct clause, but the CustomerIDs are not ordered despite I wrote orderby o.CustomerID!

The SQL query for this second LINQ query is the following:

SELECT DISTINCT [t0].[CustomerID] FROM [dbo].[Orders] AS [t0] 

As we can see **the ORDER BY clause is missing. Why is that?

Why does the ORDER BY clause disappears when I use the Distinct() method?

like image 820
Dante Avatar asked Sep 14 '12 17:09

Dante


People also ask

Can you use distinct and ORDER BY together?

Without a transformation, a statement that contains both DISTINCT and ORDER BY would require two separate sorting steps-one to satisfy DISTINCT and one to satisfy ORDER BY. (Currently, Derby uses sorting to evaluate DISTINCT.

Does distinct affect performance?

Yes, the application needs to compare every record to the "distinct" records cache as it goes. You can improve performance by using an index, particularly on the numeric and date fields. The operation you describe is O(n²).

Does distinct SQL sort?

No. There are a number of circumstances in which a DISTINCT in Oracle does not imply a sort, the most important of which is the hashing algorithm used in 10g+ for both group by and distinct operations. Always specify ORDER BY if you want an ordered result set, even in 9i and below.

Can we use distinct with count (*)?

Yes, you can use COUNT() and DISTINCT together to display the count of only distinct rows.


1 Answers

From the Queryable.Distinct documentation;

The expected behavior is that it returns an unordered sequence of the unique items in source.

In other words, any order the existing IQueryable has is lost when you use Distinct() on it.

What you want is probably something more like this, an OrderBy() after the Distinct() is done;

var query = (from o in db.Orders              select new              {                  o.CustomerID              }).Distinct().OrderBy(x => x.CustomerID); 
like image 200
Joachim Isaksson Avatar answered Sep 28 '22 16:09

Joachim Isaksson