 

Is there a way to ensure WHERE clause happens after DISTINCT?

Imagine you have a table comments in your database.

The comments table has the columns id, comment_id_no, text, show, and inserted_at.

If a user enters a comment, it inserts a row into the database

| id | comment_id_no | text | show | inserted_at |
| -- | ------------- | ---- | ---- | ----------- |
| 1  | 1             | hi   | true | 1/1/2000    |

If a user wants to update that comment it inserts a new row into the db

| id | comment_id_no | text | show | inserted_at |
| -- | ------------- | ---- | ---- | ----------- |
| 1  | 1             | hi   | true | 1/1/2000    |
| 2  | 1             | hey  | true | 1/1/2001    |

Notice it keeps the same comment_id_no. This is so we will be able to see the history of a comment.

Now the user decides that they no longer want to display their comment

| id | comment_id_no | text | show  | inserted_at |
| -- | ------------- | ---- | ----- | ----------- |
| 1  | 1             | hi   | true  | 1/1/2000    |
| 2  | 1             | hey  | true  | 1/1/2001    |
| 3  | 1             | hey  | false | 1/1/2002    |

This hides the comment from the end users.

Now a second comment is made (not an update of the first)

| id | comment_id_no | text | show  | inserted_at |
| -- | ------------- | ---- | ----- | ----------- |
| 1  | 1             | hi   | true  | 1/1/2000    |
| 2  | 1             | hey  | true  | 1/1/2001    |
| 3  | 1             | hey  | false | 1/1/2002    |
| 4  | 2             | new  | true  | 1/1/2003    |

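What I would like to do is select the latest version of each unique comment_id_no, but only where that latest version has show equal to true. In particular, I do not want the query to return id=2: it is the most recent row for comment_id_no 1 with show = true, but it is not the latest version of that comment.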

Steps the query needs to take...

  1. select all the most recent, distinct comment_id_nos. (should return id=3 and id=4)
  2. select where show = true (should only return id=4)
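
The order of these two steps is what matters. As a minimal sketch in plain Python (not SQL or Ecto), assuming the four sample rows above with the dates rewritten as ISO strings so they sort chronologically, the helper names `latest_versions`, `visible_latest` and `filter_first` are made up for illustration:

```python
from typing import NamedTuple


class Comment(NamedTuple):
    id: int
    comment_id_no: int
    text: str
    show: bool
    inserted_at: str  # ISO date string (assumed format) so string order is date order


# The four sample rows from the question.
ROWS = [
    Comment(1, 1, "hi", True, "2000-01-01"),
    Comment(2, 1, "hey", True, "2001-01-01"),
    Comment(3, 1, "hey", False, "2002-01-01"),
    Comment(4, 2, "new", True, "2003-01-01"),
]


def latest_versions(rows):
    """Step 1: keep only the most recent row per comment_id_no."""
    latest = {}
    for row in rows:
        current = latest.get(row.comment_id_no)
        if current is None or row.inserted_at > current.inserted_at:
            latest[row.comment_id_no] = row
    return sorted(latest.values(), key=lambda r: r.id)


def visible_latest(rows):
    """Step 2 applied after step 1: filter the latest versions on show."""
    return [r for r in latest_versions(rows) if r.show]


def filter_first(rows):
    """The unwanted order: filter on show first, then take the latest."""
    return latest_versions([r for r in rows if r.show])


print([r.id for r in latest_versions(ROWS)])  # [3, 4]
print([r.id for r in visible_latest(ROWS)])   # [4]
print([r.id for r in filter_first(ROWS)])     # [2, 4]
```

Filtering first is what brings id=2 back, which is why the show condition has to be applied to the rows that survive the "latest version" step.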

Note: I am actually writing this query in Elixir using Ecto and would like to be able to do this without using the subquery function. If anyone can answer this in SQL, I can convert the answer myself. If anyone knows how to answer this in Elixir, feel free to answer that way too.

asked Feb 14 '19 12:02 by RobStallion



2 Answers

You can do this without a subquery by using a LEFT JOIN:

    SELECT  c.id, c.comment_id_no, c.text, c.show, c.inserted_at
    FROM    Comments AS c
            LEFT JOIN Comments AS c2
                ON c2.comment_id_no = c.comment_id_no
                AND c2.inserted_at > c.inserted_at
    WHERE   c2.id IS NULL
    AND     c.show = 'true';
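
To try the LEFT JOIN approach against the sample data, here is a small sketch using Python's built-in sqlite3 module rather than PostgreSQL. It assumes show is stored as 0/1 (SQLite has no boolean type) and inserted_at as ISO date strings; `make_db` and `visible_latest_ids` are illustrative names only:

```python
import sqlite3


def make_db():
    """Build an in-memory copy of the question's sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE comments (id INTEGER PRIMARY KEY, comment_id_no INTEGER, '
        '"text" TEXT, "show" INTEGER, inserted_at TEXT)'
    )
    conn.executemany(
        "INSERT INTO comments VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, "hi", 1, "2000-01-01"),
            (2, 1, "hey", 1, "2001-01-01"),
            (3, 1, "hey", 0, "2002-01-01"),
            (4, 2, "new", 1, "2003-01-01"),
        ],
    )
    return conn


# A row is the latest version when no newer row with the same
# comment_id_no exists, i.e. the LEFT JOIN finds no match (c2.id IS NULL).
ANTI_JOIN = """
SELECT c.id
FROM comments AS c
LEFT JOIN comments AS c2
    ON c2.comment_id_no = c.comment_id_no
    AND c2.inserted_at > c.inserted_at
WHERE c2.id IS NULL
  AND c."show" = 1
ORDER BY c.id
"""


def visible_latest_ids(conn):
    return [row[0] for row in conn.execute(ANTI_JOIN)]


conn = make_db()
print(visible_latest_ids(conn))  # [4]
```

The show condition stays in the WHERE clause on c, so it only tests rows that already have no newer version.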

I think all other approaches will require a subquery of some sort. This would usually be done with a ranking function:

    SELECT  c.id, c.comment_id_no, c.text, c.show, c.inserted_at
    FROM    (   SELECT  c.id,
                        c.comment_id_no,
                        c.text,
                        c.show,
                        c.inserted_at,
                        ROW_NUMBER() OVER(PARTITION BY c.comment_id_no
                                          ORDER BY c.inserted_at DESC) AS RowNumber
                FROM    Comments AS c
            ) AS c
    WHERE   c.RowNumber = 1
    AND     c.show = 'true';
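
The ranking version can be checked the same way. This is a sketch with Python's sqlite3 module, assuming the bundled SQLite is 3.25 or newer (the first release with window functions), show stored as 0/1 and ISO date strings; the names `make_db`, `RANKING` and `RANKED_FILTER` are invented for the example:

```python
import sqlite3


def make_db():
    """Build an in-memory copy of the question's sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE comments (id INTEGER PRIMARY KEY, comment_id_no INTEGER, '
        '"text" TEXT, "show" INTEGER, inserted_at TEXT)'
    )
    conn.executemany(
        "INSERT INTO comments VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, "hi", 1, "2000-01-01"),
            (2, 1, "hey", 1, "2001-01-01"),
            (3, 1, "hey", 0, "2002-01-01"),
            (4, 2, "new", 1, "2003-01-01"),
        ],
    )
    return conn


# Inner ranking on its own: 1 marks the newest row of each comment_id_no.
RANKING = """
SELECT c.id,
       ROW_NUMBER() OVER (PARTITION BY c.comment_id_no
                          ORDER BY c.inserted_at DESC) AS rn
FROM comments AS c
ORDER BY c.id
"""

# Full query: rank first in the derived table, filter on show afterwards.
RANKED_FILTER = """
SELECT ranked.id
FROM (
    SELECT c.id, c."show",
           ROW_NUMBER() OVER (PARTITION BY c.comment_id_no
                              ORDER BY c.inserted_at DESC) AS rn
    FROM comments AS c
) AS ranked
WHERE ranked.rn = 1
  AND ranked."show" = 1
ORDER BY ranked.id
"""

conn = make_db()
print(conn.execute(RANKING).fetchall())                   # [(1, 3), (2, 2), (3, 1), (4, 1)]
print([row[0] for row in conn.execute(RANKED_FILTER)])    # [4]
```

Because the row numbers are computed inside the derived table, the outer WHERE cannot change which row counts as the latest.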

Since you have tagged the question with PostgreSQL, you could also make use of DISTINCT ON ():

    SELECT  *
    FROM    (   SELECT  DISTINCT ON (c.comment_id_no)
                        c.id, c.comment_id_no, c.text, c.show, c.inserted_at
                FROM    Comments AS c
                ORDER BY c.comment_id_no, c.inserted_at DESC
            ) x
    WHERE   show = 'true';

Examples on DB<>Fiddle

answered Sep 22 '22 17:09 by GarethD


I think you want:

    select c.*
    from comments c
    where c.inserted_at = (select max(c2.inserted_at)
                           from comments c2
                           where c2.comment_id_no = c.comment_id_no
                          ) and
          c.show = 'true';
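
One thing worth checking with the correlated subquery is what happens when two versions share the same inserted_at. The sketch below uses Python's sqlite3 module and adds a hypothetical fifth row (not in the question) that hides comment 2 at the same timestamp as id=4. The `BY_ID` variant is an assumption on my part: it only makes sense if ids always increase with each new version.

```python
import sqlite3


def make_db():
    """Sample table from the question plus one hypothetical tied row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE comments (id INTEGER PRIMARY KEY, comment_id_no INTEGER, '
        '"text" TEXT, "show" INTEGER, inserted_at TEXT)'
    )
    conn.executemany(
        "INSERT INTO comments VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, "hi", 1, "2000-01-01"),
            (2, 1, "hey", 1, "2001-01-01"),
            (3, 1, "hey", 0, "2002-01-01"),
            (4, 2, "new", 1, "2003-01-01"),
            # Hypothetical row: same timestamp as id=4, but hides the comment.
            (5, 2, "new", 0, "2003-01-01"),
        ],
    )
    return conn


# Latest version chosen by timestamp, as in the answer above.
BY_TIMESTAMP = """
SELECT c.id
FROM comments c
WHERE c.inserted_at = (SELECT MAX(c2.inserted_at)
                       FROM comments c2
                       WHERE c2.comment_id_no = c.comment_id_no)
  AND c."show" = 1
ORDER BY c.id
"""

# Latest version chosen by id (assumes ids grow with every new version).
BY_ID = """
SELECT c.id
FROM comments c
WHERE c.id = (SELECT MAX(c2.id)
              FROM comments c2
              WHERE c2.comment_id_no = c.comment_id_no)
  AND c."show" = 1
ORDER BY c.id
"""

conn = make_db()
print([row[0] for row in conn.execute(BY_TIMESTAMP)])  # [4]
print([row[0] for row in conn.execute(BY_ID)])         # []
```

With tied timestamps both rows count as "latest", so the hidden comment still shows up via id=4. If inserted_at has enough precision that ties cannot happen, the original query is fine as written.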

I don't understand what this has to do with select distinct. You simply want the last version of a comment, and then to check if you can show that.

EDIT:

In Postgres, I would do:

    select c.*
    from (select distinct on (comment_id_no) c.*
          from comments c
          order by c.comment_id_no, c.inserted_at desc
         ) c
    where c.show;

distinct on usually has pretty good performance characteristics.

answered Sep 18 '22 17:09 by Gordon Linoff