I have a messages table which looks like this:
+-----------+------------+---------+
| sender_id | created_at | message |
+-----------+------------+---------+
| 1         | 2010-06-14 | the msg |
| 1         | 2010-06-15 | the msg |
| 2         | 2010-06-16 | the msg |
| 3         | 2010-06-14 | the msg |
+-----------+------------+---------+
I want to select the single most recent message for each sender.
This seems like a job for GROUP BY sender_id and ORDER BY created_at, but I'm having trouble getting the most recent message selected.
I'm using PostgreSQL, so I need an aggregate function on the created_at field in the SELECT list if I want to order by that field. As an initial test, I was looking at doing something like this:
SELECT messages.sender_id, MAX(messages.created_at) as the_date
FROM messages
GROUP BY sender_id
ORDER BY the_date DESC
LIMIT 10;
This seems to work, but when I want to select 'message' as well, I have no idea what aggregate function to use on it. I basically just want the message that corresponds to the MAX created_at.
Is there some way of getting at this or am I approaching it the wrong way?
This:
SELECT *
FROM (
    SELECT DISTINCT ON (sender_id) *
    FROM messages
    ORDER BY sender_id, created_at DESC
) q
ORDER BY created_at DESC
LIMIT 5;
or this:
SELECT (mi).*
FROM (
    SELECT (
        SELECT mi
        FROM messages mi
        WHERE mi.sender_id = m.sender_id
        ORDER BY created_at DESC
        LIMIT 1
    ) AS mi
    FROM messages m
    GROUP BY sender_id
) q
ORDER BY (mi).created_at DESC
LIMIT 5;
Create an index on (sender_id, created_at) for this to work fast.
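As a rough sketch of that index, assuming the table is named messages as in the question (the index name below is made up for illustration):

```sql
-- Hypothetical index name; the columns follow the question's table.
-- sender_id comes first so all rows for one sender sit together in the
-- index, ordered by created_at within each sender.
CREATE INDEX messages_sender_created_idx
    ON messages (sender_id, created_at);
```

Running EXPLAIN on either query should show whether the planner actually uses the index on your data.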
Use a correlated sub query:
select * from messages m1
where m1.created_at = (
select max(m2.created_at)
from messages m2
where m1.sender_id = m2.sender_id
);
The subquery is re-evaluated for each row processed by the outer query.
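One thing to watch with the query above: if a sender has two messages sharing the same latest created_at, both rows come back. A sketch of one way to keep exactly one row per sender is a window function (available in PostgreSQL 8.4 and later). The aliases ranked and rn are made up here, and a tie is broken arbitrarily unless another column is added to the ORDER BY:

```sql
SELECT sender_id, created_at, message
FROM (
    SELECT m.*,
           -- Number each sender's messages from newest to oldest.
           ROW_NUMBER() OVER (
               PARTITION BY sender_id
               ORDER BY created_at DESC
           ) AS rn
    FROM messages m
) ranked
WHERE rn = 1;  -- keep only the newest message per sender
```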