
How to efficiently vacuum analyze tables in Postgres

I had a huge query running on Postgres, and one of the joined tables was always read with a sequential scan. There is an index on the constraint column, but Postgres just didn't use it. After I ran a VACUUM ANALYZE, the query plan showed that an index scan is now being used.

My question is, what is the most efficient way to run a VACUUM ANALYZE? Does it lock tables? If so, how do you run VACUUM ANALYZE on live production tables?

Ramanan asked May 16 '16 19:05


1 Answer

VACUUM ANALYZE actually performs two entirely different tasks:

  1. VACUUM reclaims the space occupied by dead tuples (rows) so that it can be reused.
  2. ANALYZE collects statistics about the contents of the table, which help the planner create better query plans.

A manual VACUUM ANALYZE is a cleanup operation that is usually run once a week or once a month, depending on how frequently rows in the database are updated or deleted. It can be run on specific tables or on the whole database. It can take anywhere from 30 minutes to several days, depending on the size of the database and how often you run it.

When to use VACUUM FULL and ANALYZE:

If your database is taking up too much space and there is no room left for your OS to perform any other operation, then you need to do a VACUUM FULL. It is also recommended to add the ANALYZE option to it. Keep in mind that VACUUM FULL writes a new copy of each table, so it temporarily needs extra disk space while it runs. If you have a database with a high write frequency, I would recommend performing this operation at least once every 3-6 months.

VACUUM(FULL, ANALYZE, VERBOSE);

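If you can't lock the whole database and you just need to free up the space taken by a table that receives a lot of updates/deletes, run VACUUM FULL on that specific table. VACUUM FULL takes an exclusive lock on the table it is rewriting, so reads and writes against that table wait until it finishes.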

VACUUM FULL VERBOSE your_table_name;
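
For a live production table that must stay available, a plain VACUUM (without FULL) combined with ANALYZE is worth considering as an alternative. The following is a minimal sketch that reuses the placeholder your_table_name from above. Plain VACUUM does not block ordinary reads and writes, but it generally only makes dead space reusable inside the table instead of returning it to the operating system.

```sql
-- Plain VACUUM plus ANALYZE: runs alongside normal SELECT/INSERT/UPDATE/DELETE.
-- your_table_name is a placeholder, as in the commands above.
VACUUM (VERBOSE, ANALYZE) your_table_name;
```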

If your queries become slower over time, for example if EXPLAIN shows a sequential scan for a query while the same query with different parameters uses an index scan, this often means that the table's statistics are out of date. ANALYZE can be run on the whole database or on a specific table. It does not block reads or writes on the table while it runs, and your queries should perform better afterwards.

ANALYZE VERBOSE your_table_name;
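
To judge whether a table's statistics are likely stale before running ANALYZE, one option is to look at the statistics views. This is a sketch, assuming PostgreSQL 9.4 or later (where the n_mod_since_analyze column exists) and the same placeholder table name:

```sql
-- When was the table last analyzed, and how many rows changed since then?
SELECT relname,
       n_live_tup,            -- estimated number of live rows
       n_mod_since_analyze,   -- rows modified since the last ANALYZE
       last_analyze,          -- last manual ANALYZE
       last_autoanalyze       -- last ANALYZE run by the autovacuum daemon
FROM pg_stat_user_tables
WHERE relname = 'your_table_name';
```

A large n_mod_since_analyze relative to n_live_tup, together with old timestamps, suggests that a manual ANALYZE may help.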

Auto Analyze:

You might never need to manually ANALYZE a database, because this is done automatically by the autovacuum daemon, which runs in the background and analyzes tables that pass a certain threshold of inserts/updates/deletes. By default that threshold is 10% of the table's rows. On large tables, however, this threshold may take a long time to reach, and queries can become slow once even 5% of the rows have changed. Therefore ANALYZE should be run manually, along with VACUUM FULL, at regular intervals.
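
The answer does not cover this option, but the analyze threshold can also be lowered for a single large table through storage parameters, so that auto analyze runs sooner. This is a sketch only: the values 0.01 and 1000 are illustrative assumptions, not recommendations, and your_table_name is a placeholder.

```sql
-- Analyze this table after roughly 1% of its rows (plus 1000) have changed,
-- instead of the default 10% (plus 50). Values are illustrative only.
ALTER TABLE your_table_name
    SET (autovacuum_analyze_scale_factor = 0.01,
         autovacuum_analyze_threshold = 1000);

-- Return to the server-wide defaults for this table.
ALTER TABLE your_table_name
    RESET (autovacuum_analyze_scale_factor, autovacuum_analyze_threshold);
```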

Auto Vacuum:

Autovacuum is the background daemon that vacuums tables without locking them against reads and writes. The same daemon also performs the auto analyze described above, so a table that is autovacuumed is analyzed automatically as well. By default, autovacuum processes a table once the number of updated or deleted rows exceeds 20% of the table's rows.
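
The percentages quoted here are defaults and may have been changed on your server. A quick way to check the values currently in effect is to query pg_settings, as in this sketch:

```sql
-- Show whether autovacuum is enabled and which thresholds it uses.
SELECT name, setting
FROM pg_settings
WHERE name IN ('autovacuum',
               'autovacuum_vacuum_scale_factor',
               'autovacuum_vacuum_threshold',
               'autovacuum_analyze_scale_factor',
               'autovacuum_analyze_threshold')
ORDER BY name;
```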

Example:

For a table of 40 million rows, autovacuum will run once the table has received about 8 million updates or deletes. Similarly, the table needs to receive about 4 million changes before auto analyze starts. Tables of this size would usually become slow before these thresholds are reached, which is why a manual VACUUM FULL ANALYZE is recommended on a regular basis.
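
The arithmetic behind this example can be sketched as follows. It assumes the stock defaults (a fixed threshold of 50 rows plus a scale factor of 0.2 for vacuum and 0.1 for analyze), which come out at roughly 8 million and 4 million rows for this table. The second query compares a real table's dead rows with its vacuum trigger point. It ignores per-table overrides, relies on the planner's row estimate in pg_class.reltuples, and uses your_table_name as a placeholder.

```sql
-- Trigger point = fixed threshold + scale factor * number of rows.
-- 50, 0.2 and 0.1 are the default settings; 40000000 is the row count from the example.
SELECT 50 + 0.2 * 40000000 AS autovacuum_trigger_rows,
       50 + 0.1 * 40000000 AS autoanalyze_trigger_rows;

-- How close is a real table to its autovacuum trigger point?
SELECT s.relname,
       s.n_dead_tup,
       current_setting('autovacuum_vacuum_threshold')::bigint
         + current_setting('autovacuum_vacuum_scale_factor')::float8 * c.reltuples
         AS autovacuum_trigger_rows
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
WHERE s.relname = 'your_table_name';
```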

Omer Farooq answered Sep 21 '22 10:09