I am using Python with psycopg2 and I'm trying to run a full VACUUM
after a daily operation which inserts several thousand rows. The problem is that when I try to run the VACUUM
command within my code I get the following error:
psycopg2.InternalError: VACUUM cannot run inside a transaction block
How do I run this from the code outside a transaction block?
If it makes a difference, I have a simple DB abstraction class, a subset of which is displayed below for context (not runnable, exception-handling and docstrings omitted and line spanning adjustments made):
class db(object): def __init__(dbname, host, port, user, password): self.conn = psycopg2.connect("dbname=%s host=%s port=%s \ user=%s password=%s" \ % (dbname, host, port, user, password)) self.cursor = self.conn.cursor() def _doQuery(self, query): self.cursor.execute(query) self.conn.commit() def vacuum(self): query = "VACUUM FULL" self._doQuery(query)
If you wanted to vacuum all tables and minimize the database file by returning the unused space to the operating system, you would run the following vacuum statement: VACUUM FULL; This example would rewrite all tables into a new file, thus requiring an exclusive lock on each table.
Vacuum is like defrag, it's good to do if youve recently deleted a lot of stuff, or maybe after youve inserted a lot of stuff, but by no means should you do it in every transaction. It's slower than almost any other database command and is more of a maintenance task.
PostgreSQL database tables are auto-vacuumed by default when 20% of the rows plus 50 rows are inserted, updated, or deleted. Tables are auto-analyzed when a threshold is met for 10% of the rows plus 50 rows. For example, a table with 10000 rows is not auto-vacuumed until 2050 rows are inserted, updated, or deleted.
A REINDEX immediately after a VACUUM FULL is useless because VACUUM FULL itself rebuilds the indexes.
After more searching I have discovered the isolation_level property of the psycopg2 connection object. It turns out that changing this to 0
will move you out of a transaction block. Changing the vacuum method of the above class to the following solves it. Note that I also set the isolation level back to what it previously was just in case (seems to be 1
by default).
def vacuum(self): old_isolation_level = self.conn.isolation_level self.conn.set_isolation_level(0) query = "VACUUM FULL" self._doQuery(query) self.conn.set_isolation_level(old_isolation_level)
This article (near the end on that page) provides a brief explanation of isolation levels in this context.
Additionally, you can also get the messages given by the Vacuum or Analyse using:
>> print conn.notices #conn is the connection object
this command print a list with the log message of queries like Vacuum and Analyse:
INFO: "usuario": processados 1 de 1 páginas, contendo 7 registros vigentes e 0 registros não vigentes; 7 registros amostrados, 7 registros totais estimados INFO: analisando "public.usuario"
This can be useful to the DBAs ^^
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With