Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PostgreSQL - how to run VACUUM from code outside transaction block?

I am using Python with psycopg2 and I'm trying to run a full VACUUM after a daily operation which inserts several thousand rows. The problem is that when I try to run the VACUUM command within my code I get the following error:

psycopg2.InternalError: VACUUM cannot run inside a transaction block 

How do I run this from the code outside a transaction block?

If it makes a difference, I have a simple DB abstraction class, a subset of which is displayed below for context (not runnable, exception-handling and docstrings omitted and line spanning adjustments made):

class db(object):     def __init__(dbname, host, port, user, password):         self.conn = psycopg2.connect("dbname=%s host=%s port=%s \                                       user=%s password=%s" \                                       % (dbname, host, port, user, password))          self.cursor = self.conn.cursor()      def _doQuery(self, query):         self.cursor.execute(query)         self.conn.commit()      def vacuum(self):         query = "VACUUM FULL"         self._doQuery(query) 
like image 698
Wayne Koorts Avatar asked Jun 19 '09 11:06

Wayne Koorts


People also ask

How do I run a vacuum in PostgreSQL?

If you wanted to vacuum all tables and minimize the database file by returning the unused space to the operating system, you would run the following vacuum statement: VACUUM FULL; This example would rewrite all tables into a new file, thus requiring an exclusive lock on each table.

Do I need to commit after vacuum?

Vacuum is like defrag, it's good to do if youve recently deleted a lot of stuff, or maybe after youve inserted a lot of stuff, but by no means should you do it in every transaction. It's slower than almost any other database command and is more of a maintenance task.

Does Postgres vacuum automatically?

PostgreSQL database tables are auto-vacuumed by default when 20% of the rows plus 50 rows are inserted, updated, or deleted. Tables are auto-analyzed when a threshold is met for 10% of the rows plus 50 rows. For example, a table with 10000 rows is not auto-vacuumed until 2050 rows are inserted, updated, or deleted.

Does vacuum full reindex?

A REINDEX immediately after a VACUUM FULL is useless because VACUUM FULL itself rebuilds the indexes.


2 Answers

After more searching I have discovered the isolation_level property of the psycopg2 connection object. It turns out that changing this to 0 will move you out of a transaction block. Changing the vacuum method of the above class to the following solves it. Note that I also set the isolation level back to what it previously was just in case (seems to be 1 by default).

def vacuum(self):     old_isolation_level = self.conn.isolation_level     self.conn.set_isolation_level(0)     query = "VACUUM FULL"     self._doQuery(query)     self.conn.set_isolation_level(old_isolation_level) 

This article (near the end on that page) provides a brief explanation of isolation levels in this context.

like image 65
Wayne Koorts Avatar answered Oct 10 '22 06:10

Wayne Koorts


Additionally, you can also get the messages given by the Vacuum or Analyse using:

>> print conn.notices #conn is the connection object 

this command print a list with the log message of queries like Vacuum and Analyse:

INFO:  "usuario": processados 1 de 1 páginas, contendo 7 registros vigentes e 0 registros não vigentes; 7 registros amostrados, 7 registros totais estimados    INFO:  analisando "public.usuario" 

This can be useful to the DBAs ^^

like image 33
Diego Guimaraes Avatar answered Oct 10 '22 05:10

Diego Guimaraes