We have a table of 180m rows, 20 GB in size. Table DDL is:
create table app.table
(
a_id integer not null,
b_id integer not null,
c_id integer not null,
d_id integer not null,
e_id integer not null,
f_id integer not null,
a_date timestamp not null,
date_added timestamp,
last_date_modified timestamp default now()
);
Value distributions:
a_id has a range of 0-160,000,000
b_id has one value
c_id has a range of 0-4
d_id has one value (currently)
e_id has one value (currently)
The primary key is a composite key:
alter table app.table add constraint table_pk primary key (a_id, b_id, c_id, d_id, e_id);
We're running an r6g.xlarge cluster on Aurora PostgreSQL v12.8. It's one instance with no other traffic hitting it. We've run ANALYZE and VACUUM ANALYZE against the table:
INFO: "table": scanned 30000 of 1711284 pages, containing 3210000 live
rows and 0 dead rows; 30000 rows in sample, 183107388 estimated total rows
Problem
This query takes 9 seconds to run when shared_buffers is cold (or as cold as we can get it):
select a_id, b_id, c_id, d_id, a_date
from app.table ts
where a_id in ( <5000 values> )
and b_id = 34
and c_id in (2,3)
and d_id = 0
EXPLAIN output:
Index Scan using table_pk on table ts (cost=0.57..419134.91 rows=237802 width=24) (actual time=8.335..9803.424 rows=5726 loops=1)
" Index Cond: ((a_id = ANY ('{66986803,90478329,...,121697593}'::integer[])) AND (b_id = 34))"
" Filter: (c_id = ANY ('{2,3}'::integer[])))"
Rows Removed by Filter: 3
Buffers: shared hit=12610 read=10593
I/O Timings: read=9706.055
Planning:
Buffers: shared hit=112 read=29
I/O Timings: read=29.227
Planning Time: 33.437 ms
Execution Time: 9806.271 ms
We think this is unreasonably slow. When the query is run again, and thus served from cache, it takes 25 ms. We'd rather not prewarm if possible.
In any case, we'd rather have better performance for this sort of query, around the 1-2 second mark if possible. Any ideas on how we could improve the performance?
EDIT - Effect of adding a covering index:
Tried adding a covering index to include the "a_date":
create unique index covering_idx on app.table (a_id, b_id, c_id, d_id, e_id) include (a_date)
EXPLAIN results after re-running the query (with cold shared_buffers cache):
Index Only Scan using covering_idx on table ts (cost=0.57..28438.58 rows=169286 width=24) (actual time=8.020..7028.442 rows=5658 loops=1)
Index Cond: ((a_id = ANY ('{134952505,150112033,…,42959574}'::integer[])) AND (b_id = 34))
Filter: ((e_id = ANY ('{0,0}'::integer[])) AND (c_id = ANY ('{2,3}'::integer[])))
Rows Removed by Filter: 2
Heap Fetches: 0
Buffers: shared hit=12353 read=7733
I/O Timings: read=6955.935
Planning:
Buffers: shared hit=80 read=8
I/O Timings: read=8.458
Planning Time: 11.930 ms
Execution Time: 7031.054 ms
Effect when using Bitmap Heap Scan vs. Index Scan:
We've discovered that we get a speed up when the query is executed using a Bitmap Heap Scan, rather than an Index Scan. We found this by forcing the plan using pg_hint_plan:
When adding /*+ BitmapScan(table) */:
Bitmap Heap Scan on table ts (cost=22912.96..60160.79 rows=9842 width=24) (actual time=3972.237..4063.417 rows=5657 loops=1)
Recheck Cond: ((a_id = ANY ('{24933126,19612702,27100661,73628268,...,150482461}'::integer[])) AND (b_id = 34))
Filter: ((d_id = ANY ('{0,0}'::integer[])) AND (c_id = ANY ('{2,3}'::integer[])))
Rows Removed by Filter: 4
Heap Blocks: exact=5644
Buffers: shared hit=14526 read=11136
I/O Timings: read=22507.527
-> Bitmap Index Scan on table_pk (cost=0.00..22898.00 rows=9842 width=0) (actual time=3969.920..3969.920 rows=5661 loops=1)
Index Cond: ((a_id = ANY ('{24933126,19612702,27100661,...,150482461}'::integer[])) AND (b_id = 34))
Buffers: shared hit=14505 read=5513
I/O Timings: read=3923.878
Planning:
Buffers: shared hit=6718
Planning Time: 21.493 ms
Execution Time: 4066.582 ms
Currently, we are thinking of forcing this plan in production using pg_hint_plan, but we'd rather understand why the planner is opting for a less optimal plan! We have also run VACUUM ANALYZE with default_statistics_target set to 1000.
Answer
You are trying to optimize query performance on a cold cache.
It's one instance with no other traffic hitting it. We've run
ANALYZE and VACUUM ANALYZE against the table
(Aside: a plain ANALYZE adds nothing over VACUUM ANALYZE, so that step was redundant.)
To optimize, minimize the number of data pages that have to be read. So ...
...decrease the storage size per row if possible. (With index-only scans, that's mostly just important for the involved index.)
... increase data locality: more qualifying tuples in the same data page means fewer pages to read.
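As a quick sanity check (a sketch using the object names from the DDL above), you can see how many 8 kB pages each relation occupies; fewer pages to read means faster cold runs:

```sql
-- Page counts per relation, as recorded by the last VACUUM/ANALYZE.
SELECT c.relname,
       c.relpages,                                   -- estimated page count
       pg_size_pretty(pg_relation_size(c.oid)) AS size
FROM   pg_class c
WHERE  c.relname IN ('table', 'table_pk', 'covering_idx');
```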
You should get some improvement from simply re-ordering columns in your PK. You now have:
primary key (a_id, b_id, c_id, d_id, e_id)
With leading a_id. Index tuples for distinct a_id are spread out as much as possible. Exactly what your query does not need. You disclosed:
b_id has one value [...]
d_id has one value (currently)
e_id has one value (currently)
c_id has a range of 0-4
a_id has a range of 0-160,000,000
Reorder columns like this to maximize locality for your query:
ALTER TABLE app.table
  DROP CONSTRAINT table_pk
, ADD  CONSTRAINT table_pk PRIMARY KEY (b_id, d_id, e_id, c_id, a_id) INCLUDE (a_date);
(Plus, add INCLUDE (a_date) to allow index-only scans, as has been suggested already.)
Since b_id, and d_id / e_id (currently), are constants, those are just noise / ballast. The important part is to move c_id before a_id: this way we never touch branches of the index with c_id IN (0,1,4), and more of the qualifying tuples end up on fewer index pages. It's a mild effect, since the query seems to use about half of the spectrum anyway.
Since b_id is a constant, it shouldn't water down the PK to begin with. The same is true for d_id and e_id if they actually remain constants.
And we don't need e_id for our query at all.
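One general caveat (not specific to this schema): index-only scans only skip heap visits when the visibility map is current, so keep the table vacuumed to hold Heap Fetches at 0, as in the plan shown above:

```sql
-- Index-only scans consult the visibility map; VACUUM keeps it current.
VACUUM (ANALYZE) app.table;
```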
This adapted query:
SELECT a_id, 34 AS b_id, c_id, 0 AS d_id, a_date
FROM app.table ts
WHERE c_id IN (2,3)
AND a_id IN ( < 5000 VALUES > )
.. in combination with this index would be much better:
CREATE INDEX foo ON app.table (c_id, a_id) INCLUDE (a_date);
Probably better still:
SELECT a_id, 34 AS b_id, 2 AS c_id, 0 AS d_id, a_date
FROM app.table ts
WHERE c_id = 2
AND a_id IN ( < 5000 VALUES > )
UNION ALL
SELECT a_id, 34 AS b_id, 3 AS c_id, 0 AS d_id, a_date
FROM app.table ts
WHERE c_id = 3
AND a_id IN ( < 5000 VALUES > )
This allows index-only scans with only index conditions (Index Cond: in the query plan) and no filter (Filter:), for minimum cost.
Or even partial indexes for the last query:
CREATE INDEX foo_c2 ON app.table (a_id) INCLUDE (a_date) WHERE c_id = 2;
CREATE INDEX foo_c3 ON app.table (a_id) INCLUDE (a_date) WHERE c_id = 3;
This eliminates irrelevant rows from the tailored index(es) and makes index access slightly cheaper still.
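To verify the planner actually uses a partial index (a sketch; the ID list is a placeholder for your real 5000 values), check the plan the same way as above:

```sql
-- With foo_c2 in place, expect an Index Only Scan with no Filter: line.
EXPLAIN (ANALYZE, BUFFERS)
SELECT a_id, a_date
FROM   app.table
WHERE  c_id = 2
AND    a_id IN (24933126, 19612702, 27100661);
```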
Also, consider upgrading to Postgres 13 or Postgres 14, which bring performance improvements for large tables and indexes (e.g. B-tree deduplication in Postgres 13).
See the bottom part of the manual page "Index-Only Scans and Covering Indexes" for more on this.