
Improving PostgreSQL Aggregate Performance

What's the best way to increase the speed of a query in PostgreSQL that's performing a MAX(id) aggregation?

I have a modest number of records associated with a ref_id, which I can COUNT() in about a second, e.g.

select count(id) as cnt from mytable where ref_id=2660

row   cnt
1     2844

However, when I try to find the most recent record id using MAX(), the query takes nearly 5 minutes.

select max(id) as id from mytable where ref_id=2660

This is surprising, because I've otherwise found PG quite fast with much more complicated queries. Why would there be such a difference in the query times, especially for such a relatively small number of records? What would be the best way to improve this performance?

EDIT: This is the query plan for the above MAX() select:

"Result  (cost=219.84..219.85 rows=1 width=0)"
"  InitPlan 1 (returns $0)"
"    ->  Limit  (cost=0.00..219.84 rows=1 width=4)"
"          ->  Index Scan Backward using mytable_pkey on mytable  (cost=0.00..773828.42 rows=3520 width=4)"
"                Filter: ((id IS NOT NULL) AND (ref_id = 2660))"
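
The plan above shows a backward scan of the primary-key index with ref_id applied only as a filter, so many rows may be read and discarded before the first match. To see actual row counts and timings rather than planner estimates, the query can be run under EXPLAIN ANALYZE (a sketch, using the table and column names from the question):

```sql
-- EXPLAIN ANALYZE executes the query and reports real timings
-- alongside the plan, which helps confirm where time is spent
EXPLAIN ANALYZE
SELECT max(id) AS id FROM mytable WHERE ref_id = 2660;
```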
Cerin asked Feb 20 '11 21:02

1 Answer

I googled around; it seems PostgreSQL (up to 8.4) doesn't handle MAX and MIN well in cases like this, and can end up scanning the table (or a large part of an index) to get the result. It's hard to say whether that's your case without the query plan and the version.

You can try this workaround:

SELECT id from mytable WHERE ref_id=2660 ORDER BY id DESC LIMIT 1

Edit: Make sure you have an index on (ref_id, id); otherwise a table scan/sort is inevitable.
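
Such a composite index can be created like this (a sketch; the index name is hypothetical, and CONCURRENTLY is optional but avoids blocking writes on a live table):

```sql
-- With (ref_id, id), the planner can satisfy
-- MAX(id) WHERE ref_id = ? by reading the tail of a single index range
CREATE INDEX CONCURRENTLY mytable_ref_id_id_idx ON mytable (ref_id, id);
```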

arthurprs answered Nov 14 '22 23:11