Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL: How to keep rows order with DISTINCT?

The following SQL query:

SELECT messages.id, messages.created_at, comments.created_at FROM messages
LEFT JOIN comments ON comments.message_id = messages.id 
WHERE (messages.id IN (429,443)) 
ORDER BY GREATEST(messages.created_at, comments.created_at) DESC

returns:

 id         messages.created_at     comments.created_at
--------------------------------------------------------
 443                2                       5
 429                1                       4
 443                2                       3

 (I replaced dates with numbers for readability)

To get each id only once I added DISTINCT:

SELECT DISTINCT messages.id FROM messages
LEFT JOIN comments ON comments.message_id = messages.id 
WHERE (messages.id IN (429,443)) 
ORDER BY GREATEST(messages.created_at, comments.created_at) DESC

But, in the result the id values changed order:

id
---
429
443

What could be the reason for that ?

How could I keep the order ?

like image 251
Misha Moroshko Avatar asked Aug 11 '11 12:08

Misha Moroshko


People also ask

Can I use ORDER BY for distinct?

There is no way this query can be executed reasonably. Either DISTINCT doesn't work (because the added extended sort key column changes its semantics), or ORDER BY doesn't work (because after DISTINCT we can no longer access the extended sort key column).

How do I keep distinct values in SQL?

The SELECT DISTINCT statement is used to return only distinct (different) values. Inside a table, a column often contains many duplicate values; and sometimes you only want to list the different (distinct) values.

Does distinct SQL sort?

No. There are a number of circumstances in which a DISTINCT in Oracle does not imply a sort, the most important of which is the hashing algorithm used in 10g+ for both group by and distinct operations. Always specify ORDER BY if you want an ordered result set, even in 9i and below.

Does distinct return the first row?

It neither returns "the first row" nor the "last row". It returns the distinct values of the name column. It doesn't matter to which row they belong.


1 Answers

the distinct key word is doing what it's supposed to do, return one row each with a given column value. Distinct doesn't allow you to specify which such row will be returned, and it's clear from the original query that such an ordering is allowed (there is a row with id 443 that follows a row with id 429).

To take control of what rows will be returned, you need to reformulate the query. A typical solution I'll take is to use a group by, selecting the group column and the desired row from each group, something to the effect of

SELECT message.id, MAX(message.created_at) FROM message GROUP BY message.id;

If I need to do more, I'll use this sort of query as a subselect in a larger query, possibly joining on the id field to get more fields from the preferred row, Or ordering the query in a particular way.

like image 157
SingleNegationElimination Avatar answered Oct 18 '22 02:10

SingleNegationElimination