Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL Order before Group by

I need to find the latest post for each author and then group the results so I only a single latest post for each author.

SELECT wp_posts.* FROM wp_posts         WHERE wp_posts.post_status='publish'         AND wp_posts.post_type='post'         GROUP BY wp_posts.post_author                    ORDER BY wp_posts.post_date DESC 

This is correctly grouping the output so I only get one post per author, but it is ordering the results after they have been grouped and not before they have been selected.

like image 276
Tom Avatar asked Feb 28 '11 10:02

Tom


2 Answers

select wp_posts.* from wp_posts
where wp_posts.post_status='publish'and wp_posts.post_type='post'
group by wp_posts.post_author
having wp_posts.post_date = MAX(wp_posts.post_date) /* ONLY THE LAST POST FOR EACH AUTHOR */
order by wp_posts.post_date desc


EDIT:

After some comments I have decided to add some additional informations.

The company I am working at also uses Postgres and especially SQL Server. This databases don't allow such queries. So I know that there is a other way to do this (I write a solution below). You shoud also have to know what you do if you don't group by all columns treated in the projection or use aggregate functions. Otherwise let it be!

I chose the solution above, because it's a specific question. Tom want to get the recent post for each author in a wordpress site. In my mind it is negligible for the analysis if a author do more than one post per second. Wordpress should even forbid it by its spam-double-post detection. I know from personal experience that there is a really significant benefit in performance doing a such dirty group by with MySQL. But if you know what you do, then you can do it! I have such dirty groups in apps where I'm professionally accountable for. Here I have tables with some mio rows which need 5-15s instead of 100++ seconds.

May be useful about some pros and cons: http://ftp.nchu.edu.tw/MySQL/tech-resources/articles/debunking-group-by-myths.html


SELECT     wp_posts.* FROM      wp_posts     JOIN      (         SELECT             g.post_author             MAX(g.post_date) AS post_date         FROM wp_posts as g         WHERE             g.post_status='publish'             AND g.post_type='post'         GROUP BY g.post_author     ) as t      ON wp_posts.post_author = t.post_author AND wp_posts.post_date = t.post_date  ORDER BY wp_posts.post_date 

But if here is more then one post per second for a author you will get more then one row and not the only last one.

Now you can spin the wheel again and get the post with the highest Id. Even here it is at least not guaranteed that you really get the last one.

like image 156
edze Avatar answered Sep 23 '22 11:09

edze


Not sure if I understand your requirement correct but following inner statement gets the list of the latest post_date for each author and joins these back with the wp_posts table to get a complete record.

SELECT  * FROM    wp_posts wp         INNER JOIN (           SELECT  post_author                   , MAX(post_date) AS post_date           FROM    wp_posts           WHERE   post_status = 'publish'                   AND post_type = 'post'           GROUP BY                   post.author         ) wpmax ON wpmax.post_author = wp.post_author                    AND wpmax.post_date = wp.post_date ORDER BY         wp.post_date DESC 
like image 38
Lieven Keersmaekers Avatar answered Sep 25 '22 11:09

Lieven Keersmaekers