
How to use DISTINCT and ORDER BY in same SELECT statement?


The problem is that the columns used in the ORDER BY aren't listed in the SELECT DISTINCT. To make this work, sort on an aggregate function and use GROUP BY so that each category collapses into a single row. (With GROUP BY Category the DISTINCT is redundant, since grouping already yields one row per category.)

Try something like this:

SELECT DISTINCT Category, MAX(CreationDate)
FROM MonitoringJob
GROUP BY Category
ORDER BY MAX(CreationDate) DESC, Category

Extended sort key columns

What you want to do doesn't work because of the logical order of operations in SQL, which, for your first query, is (simplified):

  • FROM MonitoringJob
  • SELECT Category, CreationDate, i.e. add a so-called extended sort key column
  • ORDER BY CreationDate DESC
  • SELECT Category, i.e. remove the extended sort key column again from the result.

So, thanks to the SQL standard extended sort key column feature, it is totally possible to order by something that is not in the SELECT clause, because it is being temporarily added to it behind the scenes.
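As a minimal sketch of the extended sort key behaviour, the snippet below runs the non-DISTINCT query through Python's built-in sqlite3 module. SQLite is only a convenient stand-in here, and the sample rows are hypothetical; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MonitoringJob (Category TEXT, CreationDate TEXT)")
conn.executemany(
    "INSERT INTO MonitoringJob VALUES (?, ?)",
    [
        ("Backup", "2024-01-05"),
        ("Backup", "2024-03-01"),
        ("Deploy", "2024-02-10"),
        ("Report", "2024-01-20"),
        ("Report", "2024-02-25"),
    ],
)

# CreationDate is not in the SELECT list, yet it drives the sort:
# it acts as an extended sort key column and is dropped from the result.
categories = [
    row[0]
    for row in conn.execute(
        "SELECT Category FROM MonitoringJob ORDER BY CreationDate DESC"
    )
]
print(categories)
```

Every row still appears in the result, duplicates included; only the ordering depends on the column that was not selected.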

So, why doesn't this work with DISTINCT?

If we add the DISTINCT operation, it would be added between SELECT and ORDER BY:

  • FROM MonitoringJob
  • SELECT Category, CreationDate
  • DISTINCT
  • ORDER BY CreationDate DESC
  • SELECT Category

But now, with the extended sort key column CreationDate included, the semantics of the DISTINCT operation have changed: rows are de-duplicated on (Category, CreationDate) instead of on Category alone, so the result will no longer be the same. This is not what we want, so both the SQL standard and all reasonable databases forbid this usage.
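To see how the extra column changes what DISTINCT does, here is a small sketch that compares the two de-duplications side by side. It uses Python's sqlite3 module with hypothetical sample rows, and it deliberately does not run the rejected query itself.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MonitoringJob (Category TEXT, CreationDate TEXT)")
conn.executemany(
    "INSERT INTO MonitoringJob VALUES (?, ?)",
    [
        ("Backup", "2024-01-05"),
        ("Backup", "2024-03-01"),
        ("Deploy", "2024-02-10"),
        ("Report", "2024-01-20"),
        ("Report", "2024-02-25"),
    ],
)

# DISTINCT over the column the caller actually asked for.
by_category = conn.execute(
    "SELECT DISTINCT Category FROM MonitoringJob"
).fetchall()

# DISTINCT over the column plus the extended sort key column.
by_pair = conn.execute(
    "SELECT DISTINCT Category, CreationDate FROM MonitoringJob"
).fetchall()

# The second result keeps rows that the first one would have merged.
print(len(by_category), len(by_pair))
```

With this data the first query collapses to one row per category, while the second keeps every row because each (Category, CreationDate) pair is unique.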

Workarounds

It can be emulated with standard syntax as follows:

SELECT Category
FROM (
  SELECT Category, MAX(CreationDate) AS CreationDate
  FROM MonitoringJob
  GROUP BY Category
) t
ORDER BY CreationDate DESC
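The derived-table form above can be tried out with a short sketch like the following. It assumes Python's sqlite3 module as a stand-in database and uses hypothetical sample rows; other databases may differ in details such as alias requirements.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MonitoringJob (Category TEXT, CreationDate TEXT)")
conn.executemany(
    "INSERT INTO MonitoringJob VALUES (?, ?)",
    [
        ("Backup", "2024-01-05"),
        ("Backup", "2024-03-01"),
        ("Deploy", "2024-02-10"),
        ("Report", "2024-01-20"),
        ("Report", "2024-02-25"),
    ],
)

# The inner query reduces each category to its latest CreationDate;
# the outer query sorts on that value but returns only Category.
query = """
    SELECT Category
    FROM (
      SELECT Category, MAX(CreationDate) AS CreationDate
      FROM MonitoringJob
      GROUP BY Category
    ) t
    ORDER BY CreationDate DESC
"""
latest_first = [row[0] for row in conn.execute(query)]
print(latest_first)
```

Each category appears once, ordered by its most recent job.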

Or, simply (in this case), as also shown by Prutswonder:

SELECT Category, MAX(CreationDate) AS CreationDate
FROM MonitoringJob
GROUP BY Category
ORDER BY CreationDate DESC

I have blogged about SQL DISTINCT and ORDER BY in more detail here.


If the MAX(CreationDate) column is not wanted in the output, as in the example of the original question, the simplest answer is the second statement of Prashant Gupta's answer:

SELECT [Category] FROM [MonitoringJob] 
GROUP BY [Category] ORDER BY MAX([CreationDate]) DESC

Explanation: in SQL Server, ORDER BY is not allowed inside a derived table, view, or inline function (unless TOP, OFFSET or FOR XML is also specified). So you can't take the statement from Prutswonder's answer as it stands, ORDER BY included, put an outer SELECT around it, and discard the MAX(CreationDate) part.
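A minimal sketch of this single-statement form, wrapped in a hypothetical helper function and run against Python's sqlite3 module with made-up sample rows. The square brackets are omitted because they are SQL Server-style quoting and the identifiers here need none.

```python
import sqlite3


def categories_newest_first(conn):
    """Return each category once, ordered by its latest CreationDate."""
    # The aggregate appears only in ORDER BY, so it is never returned.
    rows = conn.execute(
        "SELECT Category FROM MonitoringJob "
        "GROUP BY Category ORDER BY MAX(CreationDate) DESC"
    )
    return [row[0] for row in rows]


# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MonitoringJob (Category TEXT, CreationDate TEXT)")
conn.executemany(
    "INSERT INTO MonitoringJob VALUES (?, ?)",
    [
        ("Backup", "2024-01-05"),
        ("Backup", "2024-03-01"),
        ("Deploy", "2024-02-10"),
        ("Report", "2024-01-20"),
        ("Report", "2024-02-25"),
    ],
)

result = categories_newest_first(conn)
print(result)
```

No derived table is needed, because GROUP BY already yields one row per category and the sort key is computed per group.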


Use this code if you want the values of both the [Category] and [CreationDate] columns:

SELECT [Category], MAX([CreationDate]) FROM [MonitoringJob]
GROUP BY [Category] ORDER BY MAX([CreationDate]) DESC

Or use this code if you want only the values of the [Category] column:

SELECT [Category] FROM [MonitoringJob]
GROUP BY [Category] ORDER BY MAX([CreationDate]) DESC

Either way, you get one row per distinct category, ordered by its most recent creation date.