PostgreSQL MAX and GROUP BY

Question

I have a table with id, year and count.

I want to get the MAX(count) for each id and keep the year when it happens, so I make this query:

SELECT id, year, MAX(count) FROM table GROUP BY id;

Unfortunately, it gives me an error:

ERROR: column "table.year" must appear in the GROUP BY clause or be used in an aggregate function

So I try:

SELECT id, year, MAX(count) FROM table GROUP BY id, year;

But then, it doesn't do MAX(count), it just shows the table as it is. I suppose because when grouping by year and id, it gets the max for the id of that specific year.

So, how can I write that query? I want to get the id´s MAX(count) and the year when that happens.

Erwin Brandstetter · Accepted Answer

The shortest (and possibly fastest) query would be with DISTINCT ON, a PostgreSQL extension of the SQL standard DISTINCT clause:

SELECT DISTINCT ON (1)        id, count, year FROM   tbl ORDER  BY 1, 2 DESC, 3;

The numbers refer to ordinal positions in the SELECT list. You can spell out column names for clarity:

SELECT DISTINCT ON (id)        id, count, year FROM   tbl ORDER  BY id, count DESC, year;

The result is ordered by id etc. which may or may not be welcome. It's better than "undefined" in any case.

It also breaks ties (when multiple years share the same maximum count) in a well defined way: pick the earliest year. If you don't care, drop year from the ORDER BY. Or pick the latest year with year DESC.

For many rows per id, other query techniques are (much) faster. See:

Select first row in each GROUP BY group?
Optimize GROUP BY query to retrieve latest row per user

PostgreSQL MAX and GROUP BY

Tags:

sql

postgresql

group-by

greatest-n-per-group

max

Project Dumbo Dev

1 Answers

Erwin Brandstetter

Recent Activity

Donate For Us

PostgreSQL MAX and GROUP BY

Tags:

sql

postgresql

group-by

greatest-n-per-group

max

Project Dumbo Dev

1 Answers

Erwin Brandstetter

Related questions

Recent Activity

Donate For Us