Why does vacuum analyze change query plan while analyze does not?

I wanted to utilize the power of index-only scans in Postgres and experimented with one table:

CREATE TABLE dest.contexts
(
  id integer NOT NULL,
  phrase_id integer NOT NULL,
  lang character varying(5) NOT NULL,
  ranking_value double precision,
  index_min integer,
  index_max integer,
  case_sensitive boolean,
  is_enabled boolean,
  is_to_sync boolean NOT NULL DEFAULT true
);

insert into dest.contexts select * from source.contexts;

alter table dest.contexts
  add constraint pk_contexts primary key (id, phrase_id, lang);

 CREATE INDEX idx_contexts_
  ON dest.contexts
  USING btree
  (id, is_enabled, lang, phrase_id, ranking_value, index_min, index_max, case_sensitive);

The index covers all columns I want to use in the next query:

explain analyze
select ranking_value, index_min, index_max, case_sensitive
from dest.contexts
where id = 456 and is_enabled

I check the plan immediately after creation:

Bitmap Heap Scan on contexts  (cost=4.41..31.46 rows=12 width=17) (actual time=0.045..0.045 rows=0 loops=1)
  Recheck Cond: (id = 456)
  Filter: is_enabled
  ->  Bitmap Index Scan on idx_contexts_  (cost=0.00..4.40 rows=12 width=0) (actual time=0.038..0.038 rows=0 loops=1)
        Index Cond: ((id = 456) AND (is_enabled = true))
Planning time: 0.631 ms
Execution time: 0.093 ms

It is strange, but OK...

A few seconds later it changes to another plan (autovacuum?):

Index Scan using pk_contexts on contexts  (cost=0.28..17.93 rows=6 width=17) (actual time=0.027..0.027 rows=0 loops=1)
  Index Cond: (id = 456)
  Filter: is_enabled
Planning time: 0.185 ms
Execution time: 0.070 ms
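
(A possible way to check whether autovacuum or autoanalyze really fired on the table — a sketch using the standard pg_stat_user_tables view:)

select relname, last_autovacuum, last_autoanalyze
from pg_stat_user_tables
where schemaname = 'dest' and relname = 'contexts';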

I try to force it to use an index-only scan:

analyze dest.contexts

But it does not change anything. Then I do

vacuum verbose analyze dest.contexts;

INFO:  vacuuming "dest.contexts"
INFO:  index "pk_contexts" now contains 4845 row versions in 21 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "idx_contexts_" now contains 4845 row versions in 37 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "contexts": found 0 removable, 4845 nonremovable row versions in 41 out of 41 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "dest.contexts"
INFO:  "contexts": scanned 41 of 41 pages, containing 4845 live rows and 0 dead rows; 4845 rows in sample, 4845 estimated total rows

And here I finally get what I want:

Index Only Scan using idx_contexts_ on contexts  (cost=0.28..4.40 rows=6 width=17) (actual time=0.014..0.014 rows=0 loops=1)
  Index Cond: ((id = 456) AND (is_enabled = true))
  Filter: is_enabled
  Heap Fetches: 0
Planning time: 0.247 ms
Execution time: 0.052 ms
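
(For completeness, a sketch of one more way to confirm that the heap is no longer touched: explain (analyze, buffers) also reports buffer usage per plan node, so together with Heap Fetches: 0 it shows that only index pages are read.)

explain (analyze, buffers)
select ranking_value, index_min, index_max, case_sensitive
from dest.contexts
where id = 456 and is_enabled;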

So here are the questions:

  1. Why doesn't analyze teach it to use the large 'all-covering' index?
  2. Why does vacuum analyze do it?
  3. My table was filled from scratch with one large insert. Why does vacuum do anything at all? To my mind, there is nothing there to vacuum.
asked Sep 01 '17 by greatvovan


1 Answer

Analyze is responsible for collecting statistics, and those statistics affect the selection of the query plan and its costs.
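
(As a quick illustration — a sketch, assuming the table from the question — the statistics that analyze gathers can be inspected through the standard pg_stats view:)

select attname, n_distinct, null_frac
from pg_stats
where schemaname = 'dest' and tablename = 'contexts';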

Indexes do not contain visibility information about the tuples. Because of that, reading the data pages and filtering the result set is still required even though all columns in the select are in the index. If all tuples on a page are visible, Postgres does not need that additional work; it uses the visibility map (vm) to know which pages qualify.

Vacuum both uses and updates the visibility map while reclaiming dead tuples. Because of that, vacuum changes the query plan to an index-only scan where possible.
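
(To see the effect directly — again a sketch, using only the standard pg_class catalog — relallvisible shows how many pages are marked all-visible in the visibility map; it stays at 0 after a plain analyze and only goes up after vacuum:)

select relname, relpages, relallvisible
from pg_class
where oid = 'dest.contexts'::regclass;

-- before vacuum: relallvisible = 0, so an index-only scan would still need heap fetches
-- after vacuum:  relallvisible is close to relpages, and the index-only scan becomes cheap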

https://www.postgresql.org/docs/9.6/static/sql-vacuum.html

https://www.postgresql.org/docs/9.6/static/storage-vm.html

answered Sep 19 '22 by samed.yildirim