I'm trying to select only the IDs of a table that I'm querying on, and still be able to specify ordering on other columns.
First I tried simply doing:
SELECT DISTINCT countries.id
FROM countries
...
ORDER BY province_infos.population DESC, country_infos.population ASC
That won't work: PostgreSQL returns the error "for SELECT DISTINCT, ORDER BY expressions must appear in select list".
If I add province_infos.population and country_infos.population to the select list, it works, but I then get duplicate IDs, which I cannot have.
To resolve this, I attempted using DISTINCT ON ():
SELECT DISTINCT ON (countries.id)
countries.id, country_infos.population, province_infos.population
FROM countries
...
ORDER BY province_infos.population DESC, country_infos.population ASC
That then gives me the error "SELECT DISTINCT ON expressions must match initial ORDER BY expressions". I can't SELECT DISTINCT ON a column without ordering by it too.
It seems the only way for this to work is to do something like:
SELECT DISTINCT ON (countries.id)
countries.id
FROM countries
...
ORDER BY countries.id DESC, province_infos.population DESC, country_infos.population ASC
Unfortunately I can't do this: I cannot order by IDs, because it skews the results of the other orderings. And it seems the only way to avoid ordering by the IDs is to remove the DISTINCT from the select, but then I'll get duplicates.
Anyone know how I can work around this?
EDIT:
The ... I omitted shouldn't be relevant, but in case you want to see:
JOIN country_infos ON country_infos.country_refer = countries.id
JOIN languages ON languages.country_refer = countries.id
JOIN provinces ON provinces.country_refer = countries.id
JOIN province_infos ON province_infos.province_refer = provinces.id
WHERE country_infos.population > 10.3
AND languages.alphabet = 'Latin'
And I'm not just trying to get this working for this specific query. This is just an example I'm using to explain the predicament. I'm generating these kinds of queries automatically off of an arbitrary data structure.
The general answer to your question is that when you use DISTINCT ON (x, ...) in a SELECT statement in PostgreSQL, the database sorts by the values in the DISTINCT ON clause so that duplicates are easy to detect: once the rows are ordered by those values, a single pass that compares adjacent rows is enough to remove them. Because of this, the database forces you to start the ORDER BY with the same expressions that appear in the DISTINCT ON clause.
You can work around this by making your original query a subquery, like so:
SELECT t.id FROM
(SELECT DISTINCT ON (countries.id) countries.id
, province_infos.population
, country_infos.founding_date
FROM countries
...
ORDER BY countries.id, province_infos.population DESC, country_infos.founding_date ASC
) t
ORDER BY t.population DESC, t.founding_date ASC
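To make the two stages concrete, here is a minimal Python sketch that mimics what the subquery and the outer query each do. It is not PostgreSQL, and it is not how the server is implemented. The sample rows and the helper names distinct_on_id and outer_order are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical joined rows: (country_id, province_population, founding_date).
# ISO date strings are used so that plain string comparison orders them by date.
rows = [
    (1, 500, "1800-01-01"),
    (1, 900, "1800-01-01"),
    (2, 700, "1750-06-01"),
    (3, 900, "1700-03-15"),
    (3, 100, "1700-03-15"),
]


def distinct_on_id(rows):
    """Emulate the inner query: DISTINCT ON (id) ... ORDER BY id, population DESC, founding_date ASC."""
    # Sorting by id first puts all rows for one country next to each other.
    ordered = sorted(rows, key=lambda r: (r[0], -r[1], r[2]))
    # groupby only merges *adjacent* equal keys, which is why the sort must lead with id.
    # Taking the first row of each group keeps the top row per country.
    return [next(group) for _, group in groupby(ordered, key=itemgetter(0))]


def outer_order(rows):
    """Emulate the outer query: ORDER BY population DESC, founding_date ASC."""
    return sorted(rows, key=lambda r: (-r[1], r[2]))


picked = distinct_on_id(rows)
result_ids = [r[0] for r in outer_order(picked)]
print(result_ids)  # [3, 1, 2]
```

Note that with this approach each country is ranked by the single row the inner ordering keeps for it, here the row with its largest province population. The id sort inside the subquery no longer affects the final order, because the outer ORDER BY re-sorts the deduplicated rows.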