Postgres EXPLAIN ANALYZE cost estimate row count massively higher than actual row count. No Vacuuming?

I have a Postgres 9.4.18 database running on Heroku in a Django project. I had noticed that queries were becoming increasingly slow so I ran an "EXPLAIN ANALYZE" on one query and noticed that for one node the row estimate was massively higher than the actual row count:

->  Seq Scan on listings_listing u1  (cost=0.00..1536692.01 rows=5030003 width=8) (actual time=0.811..11263.410 rows=173537 loops=1)

I then ran "VACUUM FULL ANALYZE" on the table and then reran the "EXPLAIN ANALYZE" on the query and got:

->  Seq Scan on listings_listing u1  (cost=0.00..23554.61 rows=173537 width=8) (actual time=0.001..33.884 rows=173537 loops=1)

The execution time is now 100x faster.

So the two questions are: A) Shouldn't autovacuum be preventing this? (And how do I check whether it is enabled?) B) How did the table get this way, assuming vacuuming wasn't being performed?

--------------------------------- Update

I found this command from Heroku that reports autovacuum stats; here is the output (unfortunately I ran it after my manual vacuum):

heroku pg:vacuum_stats DATABASE_URL

schema |                  table                  | last_vacuum | last_autovacuum  |    rowcount    | dead_rowcount  | autovacuum_threshold | expect_autovacuum 
--------+-----------------------------------------+-------------+------------------+----------------+----------------+----------------------+-------------------
 public | listings_listing                        |             | 2018-06-27 15:36 |        173,537 |              0 |         34,757       | 

It would seem that the indicated threshold should have triggered autovacuum a long time ago.
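For reference, the threshold in that column follows directly from the Postgres defaults autovacuum_vacuum_threshold (50 rows) and autovacuum_vacuum_scale_factor (0.2), assuming Heroku has not overridden them:

SELECT 50 + 0.2 * 173537 AS autovacuum_threshold;
-- 34757.4, which Heroku displays truncated as 34,757

So autovacuum should fire once roughly 20% of the table's rows are dead.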

Also, here is the Heroku page for their documentation on vacuuming settings: https://devcenter.heroku.com/articles/managing-vacuum-on-heroku-postgres

asked Jul 03 '18 by jeffjv


1 Answer

To find out if autovacuum is enabled as it should be, run

SHOW autovacuum;

To find out if autovacuum is disabled for your specific table, run

SELECT reloptions FROM pg_class WHERE relname = 'listings_listing';

The answer to B) is simple:

If autovacuum isn't running, every UPDATE or DELETE will create a “dead tuple” (or “dead row version”) in the table. These will never be cleaned up unless you run VACUUM manually and will cause the table to grow so that sequential scans become slower.
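To see this in action, here is a quick sketch on a scratch table (the table name is hypothetical; note that the statistics views are updated asynchronously, so allow a moment before querying):

CREATE TABLE vacuum_demo AS
SELECT g AS id, 'x' AS val FROM generate_series(1, 10000) g;

UPDATE vacuum_demo SET val = 'y';  -- leaves 10,000 dead row versions behind

SELECT n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'vacuum_demo';

Until some form of VACUUM runs, n_dead_tup should show roughly 10,000, and the table occupies space for both the live and the dead versions.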

The answer to A) is more difficult:

There are several things that can keep autovacuum from doing its job:

  • The rate of changes to this table can be so high that autovacuum, which runs slowly by default so that it does not disturb normal activity, cannot keep up.

    In this case you should tune autovacuum to be more aggressive for this table:

    ALTER TABLE listings_listing SET (
       autovacuum_vacuum_cost_limit = 1000,
       toast.autovacuum_vacuum_cost_limit = 1000
    );
    

    If that is not good enough, you can disable the cost delay entirely:

    ALTER TABLE listings_listing SET (
       autovacuum_vacuum_cost_delay = 0,
       toast.autovacuum_vacuum_cost_delay = 0
    );
    
  • There are concurrent long transactions.

    Autovacuum can only remove dead tuples that are older than the oldest running transaction, so long transactions can keep it from doing its job.

    There is more to the story; read this blog post.

    However, this would also keep VACUUM (FULL) from doing its job, so maybe that is not your problem.

  • The table is frequently locked with SHARE UPDATE EXCLUSIVE or stronger locks, for example by running “LOCK listings_listing”.

    When autovacuum encounters such a lock, it backs down rather than blocking user activity.
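To check for the last two causes, you can look at pg_stat_activity and pg_locks (a sketch; the exact column set varies slightly between Postgres versions):

-- Long-running transactions, oldest first
SELECT pid, state, xact_start, query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start;

-- Locks on the table at SHARE UPDATE EXCLUSIVE level or stronger
SELECT l.pid, l.mode, l.granted
FROM pg_locks l
JOIN pg_class c ON c.oid = l.relation
WHERE c.relname = 'listings_listing'
  AND l.mode IN ('ShareUpdateExclusiveLock', 'ShareLock',
                 'ShareRowExclusiveLock', 'ExclusiveLock',
                 'AccessExclusiveLock');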

A useful way to determine what is going on is to query pg_stat_user_tables like this:

SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'listings_listing';

But that evidence is probably destroyed now that you have run VACUUM (FULL).

Another good thing to do is to set log_autovacuum_min_duration to something other than -1 and look into the log occasionally.
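A sketch of that setting on a self-managed instance (on Heroku you may not have the privileges for ALTER SYSTEM and would have to use their configuration tooling instead):

ALTER SYSTEM SET log_autovacuum_min_duration = 250;  -- log autovacuum runs taking 250 ms or more
SELECT pg_reload_conf();

Every autovacuum run exceeding that duration then leaves a log entry with the number of tuples removed, which makes it easy to spot tables where autovacuum is struggling.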

answered Sep 30 '22 by Laurenz Albe