Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Reason for - ORDER BY items must appear in the select list if SELECT DISTINCT is specified

Tags:

sql

sql-server

I know that the query below causes the error - ORDER BY items must appear in the select list if SELECT DISTINCT is specified.

SELECT DISTINCT city
FROM HR.Employees
WHERE country = N'USA' AND region = N'WA'
ORDER BY birthdate;

What is the reason for this ? What is actually happening ? If I don't use DISTINCT or just add birthdate in SELECT or ORDER BY city only, it gives me some output, but not an error. Is it because SELECT DISTINCT city only gives a result set with cities and nothing else ?

EDIT -(I think this may be an answer to my question)

SELECT city
FROM HR.Employees
WHERE country = N'USA' AND region = N'WA'

Consider the above query. Not sure, but I think this is how it works - Behind the scenes, SQL Server probably has a result set with all columns, but we are only shown the city column which I will call the "displayed set" . So, ordering by city or any other column is valid.

What happens to the result set when we use SELECT DISTINCT city instead ? SQL server does not have only the rows with DISTINCT cities in its result set. It has the entire result set, like the one generated by a SELECT * query. But, it shows only distinct rows based on city. Now can this displayed set be sorted based on birthdate ? No.

In a city with many employees, ie many birthdates, SQL server cannot tell which birthdate should be used for ordering the displayed set. That is why it displays an error message.

like image 614
Steam Avatar asked Sep 03 '13 05:09

Steam


2 Answers

A query with SELECT DISTINCT can be rewritten using GROUP BY. So the query:

SELECT DISTINCT city
FROM HR.Employees
WHERE country = N'USA' AND region = N'WA' ;

is equivalent to:

SELECT city
FROM HR.Employees
WHERE country = N'USA' AND region = N'WA'
GROUP BY city ;

and you can't use ORDER BY birthdate here either. The reason is the same for both queries. There may be many (more than one) rows with same city but different birthdate. Which one should be used for the ordering (if it was allowed?)

You can however use aggregate functions with a GROUP BY query:

SELECT city
FROM HR.Employees
WHERE country = N'USA' AND region = N'WA'
GROUP BY city 
ORDER BY MIN(birthdate) ;               -- or MAX(birthdate)
like image 132
ypercubeᵀᴹ Avatar answered Oct 04 '22 19:10

ypercubeᵀᴹ


try this: 
select city 
    from (
    SELECT city, min(birthdate) as birthdate
    FROM HR.Employees
    WHERE country = N'USA' AND region = N'WA'
    group by city
          ) as t
    order by birthdate
like image 41
xurca Avatar answered Oct 04 '22 18:10

xurca