This is a query which selects a set of desired rows:
select max(a), b, c, d, e
from T
group by b, c, d, e;
The table has a primary key, in column id.
I would like to identify these rows in a further query, by getting the primary key from each of those rows. How would I do that? This does not work:
select id, max(a), b, c, d, e
from T
group by b, c, d, e;
ERROR: column "T.id" must appear in the GROUP BY clause or be used in an aggregate function
I tried this after poking around in some other PostgreSQL questions, but no luck:
select distinct on (id) id, max(a), b, c, d, e
from T
group by b, c, d, e;
ERROR: column "T.id" must appear in the GROUP BY clause or be used in an aggregate function
What do I do? I know there can only be one id for each result, because it's a primary key... I literally want the primary key along with the rest of the data, for each row that the initial (working) query returns.
Essentially, grouping by the primary key of a table leaves the rows of that table unchanged. Therefore, if we group by the primary key of a table, we can select all columns of that table without an aggregate function.
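As a minimal sketch of that point, assuming the question's table T with id declared as its primary key (PostgreSQL 9.1 and later recognize this functional dependency):

```sql
-- Grouping by the primary key: every other column of T is functionally
-- dependent on id, so it may be selected without an aggregate.
-- Each group holds exactly one row, so this returns T unchanged.
select id, a, b, c, d, e
from T
group by id;
```

This shows why the rule exists, but it does not answer the question by itself, since the groups here are single rows rather than the (b, c, d, e) groups.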
When combining the GROUP BY and ORDER BY clauses, bear in mind their placement within a SELECT statement: the GROUP BY clause goes after the WHERE clause and before the ORDER BY clause.
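For illustration, a short sketch of that clause order against the question's table T (the filter on a is only an assumed example condition):

```sql
-- Clause order: WHERE, then GROUP BY, then ORDER BY.
select b, max(a) as max_a
from T
where a is not null   -- filters rows before grouping
group by b            -- forms one group per distinct b
order by b;           -- sorts the grouped result
```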
The PostgreSQL GROUP BY clause divides the rows returned by a SELECT statement into groups. What makes GROUP BY useful is that aggregate functions can then be applied per group, such as SUM() to calculate the sum of items or COUNT() to get the number of items in each group.
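A small sketch of per-group aggregates, assuming column a of the question's table T is numeric:

```sql
-- One output row per distinct value of b.
select b,
       count(*) as n_rows,   -- number of rows in the group
       sum(a)   as total_a   -- sum of a within the group
from T
group by b;
```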
You can use group by in a subquery, but your syntax is off.
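One way the subquery idea might look for this question is to compute the grouped maxima in a derived table and join back to T to pick up id. This is a sketch, not necessarily what the answerer had in mind; the aliases t, g and max_a are illustrative names.

```sql
-- Grouped maxima in a derived table, joined back to T to recover id.
select t.id, t.a, t.b, t.c, t.d, t.e
from T as t
join (
    select max(a) as max_a, b, c, d, e
    from T
    group by b, c, d, e
) as g
  on  t.a = g.max_a
  and t.b = g.b
  and t.c = g.c
  and t.d = g.d
  and t.e = g.e;
```

Two caveats: if several rows in a group share the maximum a, all of them are returned, and rows where b, c, d or e is NULL will not match with = (IS NOT DISTINCT FROM can be used in the join condition instead).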
If you don't care which id you get then you just need to wrap your id in some aggregate function that is guaranteed to give you a valid id. The max and min aggregates come to mind:
-- Or min(id) if you want better spiritual balance.
select max(id), max(a), b, c, d, e
from T
group by b, c, d, e;
Depending on your data I think using a window function would be a better plan (thanks to evil otto for the boot to the head):
select id, a, b, c, d, e
from (
    select id, a, b, c, d, e,
           rank() over (partition by b, c, d, e order by a desc) as r
    from T
) as dt
where r = 1;
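Note that rank() returns every row tied for the largest a in a group. If exactly one row per group is wanted, a PostgreSQL-specific alternative is DISTINCT ON over the grouping columns (not over id, as in the question's attempt). This is a different technique from the window function above, sketched under the assumption that ties may be broken by the lowest id:

```sql
-- One row per (b, c, d, e): the row with the largest a,
-- ties broken by the smallest id.
-- The leading ORDER BY expressions must match the DISTINCT ON list.
select distinct on (b, c, d, e)
       id, a, b, c, d, e
from T
order by b, c, d, e, a desc, id;
```

Replacing rank() with row_number() in the window query has a similar one-row-per-group effect.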
By virtue of the fact that you are grouping, there can (and likely will) be more than one matched record (e.g., more than one id value) per returned record.
PostgreSQL is pretty strict - it will not guess at what you mean.
b,c,d,e
You can use the array_agg grouping function to get an array of id values per record. See this question: Postgresql GROUP_CONCAT equivalent?
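The array_agg approach can be sketched as follows against the question's table T (the output names ids and max_a are illustrative):

```sql
-- Collect every id in each (b, c, d, e) group into an array.
select array_agg(id order by id) as ids,
       max(a) as max_a,
       b, c, d, e
from T
group by b, c, d, e;
```

The array contains all ids in the group, not only the id of the row holding the maximum a, so it identifies the group rather than a single row.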
I suggest you consider #3 as the most efficient of the possibilities.
Hope this helps. Thanks!