Why does vacuum full wait after it is "done"?

Tags:

postgresql

postgresql-9.3

vacuum

I am running a vacuum on a very large table.

When I run it, it says:

bacula=# VACUUM FULL VERBOSE file_partition_19
bacula-# ;
INFO:  vacuuming "public.file_partition_19"
INFO:  "file_partition_19": found 16242451 removable, 21024161 nonremovable row versions in 900380 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU 5.14s/14.42u sec elapsed 19.61 sec.
VACUUM
Time: 163784.767 ms
bacula=#

When it does this, the output up to the CPU line appears fairly quickly, then it waits a long time before showing the final two lines (plus the prompt). This is reflected in the difference between "elapsed 19.61 sec" and the "Time:" of 163 seconds (shown because I set \timing on).

While I haven't timed them, both times are about right - start the command, wait 20 seconds, it then shows up to the "CPU" line, then waits about 3 minutes, then prints the rest.

Is this normal? Why is it happening?

asked Jan 18 '17 22:01

AMADANON Inc.

2 Answers

It's mostly rebuilding all indexes on the table, which it has to do because "VACUUM FULL" basically does a full rewrite of the table. If you remove all indexes from your table, there should be almost no delay after the "CPU" line.
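
To get a rough sense of how much index data has to be rebuilt, queries along these lines can help. This is a sketch that assumes the table lives in the public schema, as the question's output suggests; pg_table_size and pg_indexes_size are available from PostgreSQL 9.0 onward.

```sql
-- Compare the table's size (including TOAST) with the combined size of its indexes.
SELECT pg_size_pretty(pg_table_size('public.file_partition_19'))   AS table_size,
       pg_size_pretty(pg_indexes_size('public.file_partition_19')) AS indexes_size;

-- List each index on the table with its on-disk size, largest first.
SELECT i.indexrelid::regclass AS index_name,
       pg_size_pretty(pg_relation_size(i.indexrelid::regclass)) AS index_size
FROM pg_index AS i
WHERE i.indrelid = 'public.file_partition_19'::regclass
ORDER BY pg_relation_size(i.indexrelid::regclass) DESC;
```

If the indexes are large relative to the table, it is plausible that most of the wait after the "CPU" line is spent rebuilding them.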

AFAICT, the CPU usage line is printed by a generic routine that does most of the work for other (non-FULL) vacuum modes. It is meaningless in the "VACUUM FULL" case.

If you are concerned that it takes too long, I recommend that you have a look at the "When to use VACUUM FULL and when not to" from the PostgreSQL wiki. 9 times out of 10 when people are using VACUUM FULL they actually shouldn't.
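
As a rough illustration of the alternative, the following sketch checks how many dead rows the statistics collector reports and then runs a plain VACUUM, which does not take an ACCESS EXCLUSIVE lock and does not rewrite the table. The table name is taken from the question; whether a plain VACUUM is enough depends on your workload.

```sql
-- Estimated live and dead row counts, plus when the table was last vacuumed.
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE schemaname = 'public'
  AND relname = 'file_partition_19';

-- Plain VACUUM: marks dead row space as reusable and refreshes planner statistics.
VACUUM VERBOSE ANALYZE file_partition_19;
```

Note that a plain VACUUM usually makes the freed space reusable within the table rather than returning it to the operating system.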

answered Sep 19 '22 19:09

aferber

Based on the "postgresql-9.3" tag on your question, I am assuming that you are running Postgres 9.3.

For background, you can refer to this link about "VACUUM" and "VACUUM FULL" in the pre-9.0 versions of Postgres:

VACUUM VS VACUUM FULL For Pre-9.0 versions of Postgres

Since you are on Postgres 9.3, the documentation says the following:

For clarity, 9.0 changes VACUUM FULL. As covered in the documentation, the VACUUM FULL implementation has been changed to one that's similar to using CLUSTER in older versions. This gives a slightly different set of trade-offs from the older VACUUM FULL described here. While the potential to make the database slower via index bloating had been removed by this change, it's still something you may want to avoid doing, due to the locking and general performance overhead of a VACUUM FULL.
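
The CLUSTER-like behaviour described in the quote can be observed on a throwaway table. The sketch below is only an illustration: the table name vacuum_full_demo is hypothetical, and it assumes psql in its default autocommit mode, because VACUUM cannot run inside a transaction block.

```sql
-- Hypothetical scratch table; the primary key creates the index vacuum_full_demo_pkey.
CREATE TABLE vacuum_full_demo (id integer PRIMARY KEY, payload text);

INSERT INTO vacuum_full_demo
SELECT g, repeat('x', 100) FROM generate_series(1, 10000) AS g;

-- Delete every second row so there is something to reclaim.
DELETE FROM vacuum_full_demo WHERE id % 2 = 0;

-- On-disk file identifiers before the rewrite.
SELECT pg_relation_filenode('vacuum_full_demo')      AS table_file,
       pg_relation_filenode('vacuum_full_demo_pkey') AS index_file;

VACUUM FULL vacuum_full_demo;

-- Both identifiers should now differ: the table was copied into a new file
-- and its index was rebuilt from scratch.
SELECT pg_relation_filenode('vacuum_full_demo')      AS table_file,
       pg_relation_filenode('vacuum_full_demo_pkey') AS index_file;

DROP TABLE vacuum_full_demo;
```

A changed file identifier for the index is consistent with indexes being rebuilt as part of VACUUM FULL on 9.0 and later.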

According to the current documentation, VACUUM FULL not only reclaims the space occupied by records marked as deleted, it also touches every valid record in the table and tries to reorganize them in the database pages, which is how it frees up more space than a plain VACUUM. So when we see this line in the VERBOSE output:

CPU 5.14s/14.42u sec elapsed 19.61 sec

it is the time taken by the process to go through the table, analyze it, and reclaim the space that is already marked as free. It then starts organizing the records into pages, so how long this takes depends on how fragmented the table pages are.

For example, suppose you have a new table and keep adding records sequentially, so that new records are added at the bottom of the page (based on the primary key defined). Now you delete in reverse order, so that records are only deleted from the bottom of the page. Let's say you delete half of the records in the table. In this situation there is not much page fragmentation (virtually none), so when VACUUM FULL runs the second phase it will still try to organize the valid records, but because there is no fragmentation it will not actually have to move any records and will finish faster.

But the situation described above is not how updates and deletes happen in the real world. Real-world updates and deletes on a table create a lot of page fragmentation, so during the second phase the VACUUM FULL process has to actually move valid records into free space at the beginning of each page, and hence takes more time.

Check the following sample output:

[Image: VACUUM FULL output] https://i.stack.imgur.com/qJ7Kl.png

I ran this on a very small dummy table with only 7 rows. Even so, the vacuum process (first phase) finishes in 0.03 sec (30 ms), but the total query is reported to finish in 61 ms. That tells me that even when there is nothing to reorganize, the process still checks how much can be reorganized, and that takes time. If there were actually a lot of fragmentation and records had to be reorganized, the completion time would be much longer, depending on the page fragmentation.

answered Sep 18 '22 19:09

Anup Shah
