 

PostgreSQL: improving pg_dump, pg_restore performance

When I began, I used pg_dump with the default plain format. I was unenlightened.

Research revealed time and file-size improvements with pg_dump -Fc | gzip -9 -c > dumpfile.gz. I was enlightened.
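Spelled out as a sketch (dbname is a placeholder for the actual database name):

```shell
# Custom-format dump, additionally piped through gzip -9.
# Note: -Fc already compresses internally by default, so gzip -9 on top
# mostly trades extra CPU time for a modest additional size gain.
pg_dump -Fc dbname | gzip -9 -c > dumpfile.gz
```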

When it came time to create the database anew,

# create tablespace dbname location '/SAN/dbname';
# create database dbname tablespace dbname;
# alter database dbname set temp_tablespaces = dbname;

% gunzip dumpfile.gz              # to evaluate restore time without a piped uncompression
% pg_restore -d dbname dumpfile   # into a new, empty database defined above

I felt unenlightened: the restore took 12 hours to create a database that is only a fraction of what it will become:

# select pg_size_pretty(pg_database_size('dbname'));
47 GB

Because there are predictions this database will be a few terabytes, I need to look at improving performance now.

Please, enlighten me.

asked Jan 19 '10 by Joe Creighton

People also ask

Does pg_dump affect performance?

The only impact of pg_dump is the increased I/O load and the long-running transaction it creates. The long transaction will keep autovacuum from reclaiming dead tuples for the duration of the dump. Normally that is no big problem unless you have very high write activity in the database.

Why is pg_dump so slow?

It seems that pg_dump's compression is rather slow if the data is already compressed, as it is with image data in a bytea format, and it is better to compress outside of pg_dump (-Z0). In one case the time dropped from ~70 minutes to ~5 minutes.
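A sketch of that approach (dbname is a placeholder; -Z0 turns off pg_dump's internal compression so an external compressor does the work once):

```shell
# Disable pg_dump's internal compression and compress externally instead;
# helpful when the table data is already compressed (e.g. images in bytea).
pg_dump -Fc -Z0 dbname | gzip > dumpfile.gz
```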

Is pg_dump slow?

The pg_dump/pg_restore utilities are fantastic tools for migrating from a Postgres database to another Postgres database. However, they can drastically slow down when there are very large tables in the database.

How long does a pg_restore take?

During pg_restore, the database size increases at a rate of 50 MB/minute, estimated using the SELECT pg_size_pretty(pg_database_size()) query. At this rate, it will take approximately 130 hours to complete the restore, which is a very long time.
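The arithmetic behind that estimate can be checked in the shell (the ~390 GB database size here is inferred from the quoted rate and duration, not stated in the question):

```shell
# Back-of-envelope restore ETA from the observed rate, for a hypothetical
# 390 GB database restoring at 50 MB/minute.
db_size_mb=$((390 * 1024))   # database size in MB
rate_mb_per_min=50           # observed restore throughput
eta_hours=$(( db_size_mb / rate_mb_per_min / 60 ))
echo "${eta_hours} hours"    # roughly the 130-hour figure quoted above
```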


2 Answers

First check that you are getting reasonable I/O performance from your disk setup. Then check that your PostgreSQL installation is appropriately tuned. In particular:

- shared_buffers should be set correctly
- maintenance_work_mem should be increased during the restore
- full_page_writes should be off during the restore
- wal_buffers should be increased to 16MB during the restore
- checkpoint_segments should be increased to something like 16 during the restore
- autovacuum should be disabled during the restore
- you shouldn't have any unreasonable logging on (like logging every statement executed)
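A sketch of those restore-time settings as a postgresql.conf fragment (the values are illustrative, not prescriptive; checkpoint_segments applies to the pre-9.5 servers this question concerns and was later replaced by max_wal_size):

```ini
# postgresql.conf -- temporary settings for the duration of a restore
shared_buffers = 2GB            # sized to the machine, often ~25% of RAM
maintenance_work_mem = 1GB      # speeds up index and constraint builds
full_page_writes = off          # a failed restore can be restarted from scratch
wal_buffers = 16MB
checkpoint_segments = 16        # fewer, larger checkpoints during bulk load
autovacuum = off                # re-enable (and ANALYZE) after the restore
log_statement = none            # avoid logging every restored statement
```

Revert these once the restore completes.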

If you are on 8.4 or later, also experiment with parallel restore via the --jobs (-j) option of pg_restore.
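For example (a sketch; the dump must be in custom or directory format for parallel restore to apply):

```shell
# Restore with four parallel jobs (PostgreSQL 8.4+).
# Tune -j to the number of CPU cores and the disk setup's concurrency.
pg_restore -j 4 -d dbname dumpfile
```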

answered Sep 18 '22 by Ants Aasma


Improve pg_dump & pg_restore

pg_dump: always use the directory format (-Fd) and the -j option

time pg_dump -j 8 -Fd -f /tmp/newout.dir fsdcm_external 

pg_restore: always tune postgresql.conf and use the directory format and the -j option

work_mem = 32MB
shared_buffers = 4GB
maintenance_work_mem = 2GB
full_page_writes = off
autovacuum = off
wal_buffers = -1

time pg_restore -j 8 --format=d -C -d postgres /tmp/newout.dir/
answered Sep 18 '22 by Yanar Assaf