I wanted to utilize the power of index-only scans in Postgres and experimented with one table:
CREATE TABLE dest.contexts
(
id integer NOT NULL,
phrase_id integer NOT NULL,
lang character varying(5) NOT NULL,
ranking_value double precision,
index_min integer,
index_max integer,
case_sensitive boolean,
is_enabled boolean,
is_to_sync boolean NOT NULL DEFAULT true
);
insert into dest.contexts select * from source.contexts;
alter table dest.contexts
add constraint pk_contexts primary key (id, phrase_id, lang);
CREATE INDEX idx_contexts_
ON dest.contexts
USING btree
(id, is_enabled, lang, phrase_id, ranking_value, index_min, index_max, case_sensitive);
The index covers all columns I want to use in the next query:
explain analyze
select ranking_value, index_min, index_max, case_sensitive
from dest.contexts
where id = 456 and is_enabled
I check the plan immediately after creation:
Bitmap Heap Scan on contexts (cost=4.41..31.46 rows=12 width=17) (actual time=0.045..0.045 rows=0 loops=1)
Recheck Cond: (id = 456)
Filter: is_enabled
-> Bitmap Index Scan on idx_contexts_ (cost=0.00..4.40 rows=12 width=0) (actual time=0.038..0.038 rows=0 loops=1)
Index Cond: ((id = 456) AND (is_enabled = true))
Planning time: 0.631 ms
Execution time: 0.093 ms
It is strange, but OK...
In several seconds it changes to another (autovacuum?)
Index Scan using pk_contexts on contexts (cost=0.28..17.93 rows=6 width=17) (actual time=0.027..0.027 rows=0 loops=1)
Index Cond: (id = 456)
Filter: is_enabled
Planning time: 0.185 ms
Execution time: 0.070 ms
I try to force it to use index-only scan:
analyze dest.contexts
But it does not change anything. Then I do
vacuum verbose analyze dest.contexts;
INFO: vacuuming "dest.contexts"
INFO: index "pk_contexts" now contains 4845 row versions in 21 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "idx_contexts_" now contains 4845 row versions in 37 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "contexts": found 0 removable, 4845 nonremovable row versions in 41 out of 41 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "dest.contexts"
INFO: "contexts": scanned 41 of 41 pages, containing 4845 live rows and 0 dead rows; 4845 rows in sample, 4845 estimated total rows
And here I finally get what I want:
Index Only Scan using idx_contexts_ on contexts (cost=0.28..4.40 rows=6 width=17) (actual time=0.014..0.014 rows=0 loops=1)
Index Cond: ((id = 456) AND (is_enabled = true))
Filter: is_enabled
Heap Fetches: 0
Planning time: 0.247 ms
Execution time: 0.052 ms
So here are the questions:
analyze
teach it to use the large 'all-covering' index?vacuum analyze
do it?vacuum
do anything at all? To my mind there is nothing to vacuum there.When a vacuum process runs, the space occupied by these dead tuples is marked reusable by other tuples. An “analyze” operation does what its name says – it analyzes the contents of a database's tables and collects statistics about the distribution of values in each column of every table.
Does a vacuum analyze lock tables ? No, it's the "FULL VACUUM" command that locks tables.
EXPLAIN ANALYZE will actually run the query, so be careful with updates or deletes! In those cases, consider not using ANALYZE, or you might perhaps wrap the entire statement in a transaction that you can roll back.
PostgreSQL's VACUUM command has to process each table on a regular basis for several reasons: To recover or reuse disk space occupied by updated or deleted rows. To update data statistics used by the PostgreSQL query planner. To update the visibility map, which speeds up index-only scans.
Analyze is responsible for collecting statistics, and analyze have effects on selection of query plan and plan costs.
Indexes do not contain visibility information about the tuples. Because of that, reading data pages and filtering on resultset is also applied event though all columns in select is in the index. If all tuples visible in a page, Postgres does not need additional filtering operations. Postgres uses visibility map (vm) to achieve that.
Vacuum uses and also updates visibility map while reclaiming dead tuples. Because of that, vacuum changes query plan for using index-only scan if possible.
https://www.postgresql.org/docs/9.6/static/sql-vacuum.html
https://www.postgresql.org/docs/9.6/static/storage-vm.html
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With