I have a simple table with three columns: unit_id (oid), time (timestamp), and diag (bytea). The primary key is the combination of time and unit_id.
The idea behind this query is to get the latest row (largest timestamp) for each unique unit_id. However, the row with the latest time is not always the one returned for each unit_id.
I really want to group by just the unit_id, but PostgreSQL makes me group by diag as well, since I am selecting it:
SELECT DISTINCT ON(unit_id) max(time) as time, diag, unit_id
FROM diagnostics.unit_diag_history
GROUP BY unit_id, diag
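Answer: No, the two lists do not have to match. You can GROUP BY a column that is not included in the SELECT list.
Answer: The reverse direction is stricter. Every column in the SELECT list that is not inside an aggregate function must appear in the GROUP BY clause. PostgreSQL relaxes this only for columns that are functionally dependent on the grouped columns, which requires the GROUP BY to cover the table's primary key.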
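To illustrate the GROUP BY rule on the table from the question, here is a minimal sketch. Grouping by unit_id alone is valid as long as every other selected column is aggregated, and the matching diag can then be fetched by joining back on unit_id and time. The aliases h, latest, and max_time are made up for illustration and do not appear in the original post.

```sql
-- Sketch only: table and column names come from the question;
-- the aliases h, latest, and max_time are illustrative.
SELECT h.unit_id, h.time, h.diag
FROM diagnostics.unit_diag_history AS h
JOIN (
    -- One row per unit_id. This is valid because the only
    -- non-grouped column, time, is wrapped in an aggregate.
    SELECT unit_id, max(time) AS max_time
    FROM diagnostics.unit_diag_history
    GROUP BY unit_id
) AS latest
  ON latest.unit_id = h.unit_id
 AND latest.max_time = h.time;
```

The original query fails for the opposite reason: grouping by unit_id and diag produces one group per distinct (unit_id, diag) pair, and DISTINCT ON without an ORDER BY then keeps an unpredictable one of those groups for each unit_id.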
Any time you start thinking that you want a localized GROUP BY you should start thinking about window functions instead.
I think you're after something like this:
select unit_id, time, diag
from (
    select unit_id, time, diag,
           rank() over (partition by unit_id order by time desc) as rank
    from diagnostics.unit_diag_history
) as dt
where rank = 1
You might want to add something to the ORDER BY to consistently break ties as well but that wouldn't alter the overall technique.
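Since the question already reaches for DISTINCT ON, a PostgreSQL-specific alternative may also be worth sketching. DISTINCT ON (unit_id) keeps the first row of each unit_id group in ORDER BY order, so sorting by time descending within each unit_id makes that first row the latest one. No GROUP BY or max() is involved. This assumes the table and columns are exactly as described in the question.

```sql
-- Sketch only: relies on PostgreSQL's DISTINCT ON extension.
-- The leading ORDER BY expression must match the DISTINCT ON expression.
SELECT DISTINCT ON (unit_id) unit_id, time, diag
FROM diagnostics.unit_diag_history
ORDER BY unit_id, time DESC;
```

Because (time, unit_id) is the primary key, no two rows for the same unit_id can share a timestamp, so neither this query nor the rank() version should encounter ties on this particular table.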