 

PostgreSQL: improving pg_dump, pg_restore performance

When I began, I used pg_dump with the default plain format. I was unenlightened.

Research revealed time and file-size improvements with pg_dump -Fc | gzip -9 -c > dumpfile.gz. I was enlightened.
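Spelled out as a sketch (dbname is a placeholder for the actual database name):

```shell
# Custom-format dump, additionally piped through gzip -9.
# Note: -Fc already compresses internally by default, so gzip -9 on top
# mostly trades extra CPU time for a modest additional size gain.
pg_dump -Fc dbname | gzip -9 -c > dumpfile.gz
```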

When it came time to create the database anew,

# create tablespace dbname location '/SAN/dbname';
# create database dbname tablespace dbname;
# alter database dbname set temp_tablespaces = dbname;

% gunzip dumpfile.gz              # to evaluate restore time without a piped uncompression
% pg_restore -d dbname dumpfile   # into a new, empty database defined above

I felt unenlightened: the restore took 12 hours to create a database that is only a fraction of what it will become:

# select pg_size_pretty(pg_database_size('dbname'));
47 GB

Because there are predictions this database will be a few terabytes, I need to look at improving performance now.

Please, enlighten me.

asked Jan 19 '10 by Joe Creighton

People also ask

Does pg_dump affect performance?

The only impact of pg_dump is the increased I/O load and the long-running transaction it creates. The long transaction will keep autovacuum from reclaiming dead tuples for the duration of the dump. Normally that is no big problem unless you have very high write activity in the database.

Why is pg_dump so slow?

It seems that pg_dump's compression is rather slow if the data is already compressed, as it is with image data in a bytea format, and it is better to compress outside of pg_dump (-Z0). In one case the time dropped from ~70 minutes to ~5 minutes.
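A sketch of that approach (dbname is a placeholder; -Z0 turns off pg_dump's internal compression so an external compressor does the work once):

```shell
# Disable pg_dump's internal compression and compress externally instead;
# helpful when the table data is already compressed (e.g. images in bytea).
pg_dump -Fc -Z0 dbname | gzip > dumpfile.gz
```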

Is pg_dump slow?

The pg_dump/pg_restore utilities are fantastic tools for migrating from a Postgres database to another Postgres database. However, they can drastically slow down when there are very large tables in the database.

How long does a pg_restore take?

During pg_restore, the database size increases at a rate of 50 MB/minute, estimated using the SELECT pg_size_pretty(pg_database_size()) query. At this rate, it will take approximately 130 hours to complete the restore, which is a very long time.
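The arithmetic behind that estimate can be checked in the shell (the ~390 GB database size here is inferred from the quoted rate and duration, not stated in the question):

```shell
# Back-of-envelope restore ETA from the observed rate, for a hypothetical
# 390 GB database restoring at 50 MB/minute.
db_size_mb=$((390 * 1024))   # database size in MB
rate_mb_per_min=50           # observed restore throughput
eta_hours=$(( db_size_mb / rate_mb_per_min / 60 ))
echo "${eta_hours} hours"    # roughly the 130-hour figure quoted above
```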


2 Answers

First check that you are getting reasonable I/O performance from your disk setup. Then check that your PostgreSQL installation is appropriately tuned. In particular:

- shared_buffers should be set correctly
- maintenance_work_mem should be increased during the restore
- full_page_writes should be off during the restore
- wal_buffers should be increased to 16MB during the restore
- checkpoint_segments should be increased to something like 16 during the restore
- autovacuum should be disabled during the restore
- you shouldn't have any unreasonable logging on (like logging every statement executed)
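A sketch of those restore-time settings as a postgresql.conf fragment (the values are illustrative, not prescriptive; checkpoint_segments applies to the pre-9.5 servers this question concerns and was later replaced by max_wal_size):

```ini
# postgresql.conf -- temporary settings for the duration of a restore
shared_buffers = 2GB            # sized to the machine, often ~25% of RAM
maintenance_work_mem = 1GB      # speeds up index and constraint builds
full_page_writes = off          # a failed restore can be restarted from scratch
wal_buffers = 16MB
checkpoint_segments = 16        # fewer, larger checkpoints during bulk load
autovacuum = off                # re-enable (and ANALYZE) after the restore
log_statement = none            # avoid logging every restored statement
```

Revert these once the restore completes.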

If you are on 8.4 or later, also experiment with parallel restore via the --jobs (-j) option of pg_restore.
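For example (a sketch; the dump must be in custom or directory format for parallel restore to apply):

```shell
# Restore with four parallel jobs (PostgreSQL 8.4+).
# Tune -j to the number of CPU cores and the disk setup's concurrency.
pg_restore -j 4 -d dbname dumpfile
```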

answered Sep 18 '22 by Ants Aasma


Improve pg_dump & pg_restore

pg_dump: always use the directory format (-Fd) and the -j option

time pg_dump -j 8 -Fd -f /tmp/newout.dir fsdcm_external 

pg_restore: always tune postgresql.conf and use the directory format and the -j option

work_mem = 32MB
shared_buffers = 4GB
maintenance_work_mem = 2GB
full_page_writes = off
autovacuum = off
wal_buffers = -1

time pg_restore -j 8 --format=d -C -d postgres /tmp/newout.dir/
answered Sep 18 '22 by Yanar Assaf