Amazon EC2 PostgreSQL backups: Snapshot the data directory, or pg_dump onto an EBS volume that is snapshotted consistently?

I have a PostgreSQL database on Amazon EC2 and need to determine the best way to keep its data backed up. I am considering two options:

(1) Mount an EBS volume at some directory like /pgsqldata and use that directory as the PostgreSQL data directory (on Amazon Linux the default is /var/lib/pgsql/data/). This volume would then be snapshotted frequently.

(2) Keep the PostgreSQL data directory in its default location. Then use pg_dump to frequently dump backups to a location like /pgsqldumps, on a volume that gets a snapshot after each pg_dump.

A third option would be to simply snapshot the root device volume (I am using an EBS-backed instance), since the instance is both a web server and a database server in my case. I like the idea of having a dedicated volume for data backups, though.

Finally, if I am taking direct snapshots of the live postgresql data directory, do I need to worry about possible changes to the database during the snapshot process?

Thanks

asked Jul 23 '12 17:07

Lee Schmidt

1 Answer

You should move the data directory to its own EBS volume anyway. This helps with write contention on the EBS volumes, among other benefits. In addition, I have the logs writing to their own volume and back those up as well.

To answer the question: I do both, snapshotting the EBS volume and taking a dump of the database. That way, if you want to sync your live data to a dev box (depending on the PII in the database), it is easy with a dump and restore, and you can also launch a new instance and attach a volume created from a snapshot.

If your database dump is smaller than 5 GB, you can sync it to S3 and forget about keeping the backups on their own volume. If it is larger, you will need to store it on its own EBS volume, which is then also snapshotted on a regular basis.

Here is a script I wrote to do this. It might be outdated, but it should work.
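
As a rough illustration of the dump-then-snapshot routine described above (this is not the script mentioned, only a minimal sketch), something like the following Python might do. It assumes pg_dump and the AWS CLI are installed and configured on the instance. The function names, the timestamped file naming, and the `run` parameter are hypothetical choices made for this example, and the 5 GB threshold is simply the figure quoted above.

```python
import datetime
import subprocess

# Threshold quoted in the answer above (5 GB); treated here as 5 GiB (assumption).
S3_SIZE_LIMIT = 5 * 1024**3


def dump_command(dbname, dump_dir, now=None):
    """Build a pg_dump command that writes a timestamped custom-format dump.

    `now` is expected to be a UTC datetime; it defaults to the current UTC time.
    """
    now = now or datetime.datetime.now(datetime.timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    path = f"{dump_dir.rstrip('/')}/{dbname}-{stamp}.dump"
    # -Fc selects pg_dump's custom archive format, -f names the output file.
    return ["pg_dump", "-Fc", "-f", path, dbname], path


def snapshot_command(volume_id, description):
    """Build the AWS CLI call that starts an EBS snapshot of one volume."""
    return ["aws", "ec2", "create-snapshot",
            "--volume-id", volume_id, "--description", description]


def destination(size_bytes, limit=S3_SIZE_LIMIT):
    """Pick where a dump should live, following the size rule in the answer."""
    return "s3" if size_bytes < limit else "ebs"


def backup(dbname, dump_dir, volume_id, run=subprocess.run):
    """Dump the database, flush file buffers, then snapshot the dump volume."""
    cmd, path = dump_command(dbname, dump_dir)
    run(cmd, check=True)
    # Flush cached writes so the finished dump file is on the volume.
    run(["sync"], check=True)
    run(snapshot_command(volume_id, f"pg_dump of {dbname}"), check=True)
    return path
```

A call such as `backup("mydb", "/pgsqldumps", "vol-0123456789abcdef0")` (database name and volume ID are placeholders) could be run from cron. Passing a different `run` callable makes it possible to dry-run the sequence without touching the database or AWS.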

answered Oct 09 '22 01:10

chantheman

Tags: postgresql, amazon-ec2, backup, snapshot
