
Amazon EC2 postgresql backups: Snapshot the data directory or pg_dump onto an EBS volume that is snapshotted consistently?

I have a postgresql database on amazon EC2 and need to determine the best way to keep this data backed up. I am considering two options:

(1) Mount an EBS volume to some directory like /pgsqldata and use this directory as the postgresql data directory (on Amazon Linux the default is /var/lib/pgsql/data/). Then this volume would get frequent snapshots.

or

(2) Keep the postgresql data directory in its default location. Then use pg_dump to frequently dump backups to a location like /pgsqldumps, and that volume will get a snapshot after each pg_dump.

A third option would be to simply snapshot the root device volume (I am using an EBS-backed instance) since it is both a webserver and database in my case. I like the idea of having a dedicated volume for data backups though.

Finally, if I am taking direct snapshots of the live postgresql data directory, do I need to worry about possible changes to the database during the snapshot process?

Thanks

Lee Schmidt asked Jul 23 '12 17:07



1 Answer

You should move the data directory to its own EBS volume anyway; this reduces write contention on the EBS volumes, among other benefits. In addition, I have the logs written to their own volume and back those up as well.

To answer the question: I do both. I have the EBS volume snapshotted and I also take a dump of the database. This way, if you want to sync your live data to a dev box (depending on the PII in the database), it is easy with a dump and restore, but you can also launch a new instance and attach a volume created from a snapshot just as easily. If your database dump is less than 5 GB, you can sync it to S3 and forget about storing the backups on their own volume; if it is larger, you will need to store it on its own EBS volume, which is then also snapshotted on a regular basis.

Here is a script I wrote to do this; it might be outdated, but it should work.
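
The script mentioned above is not reproduced here. As a rough illustration of the dump-then-snapshot routine described in this answer (a sketch under assumptions, not the original script), something like the following could work. The function names, the timestamped file naming, and the use of boto3 are all assumptions for illustration; `pg_dump -Fc -f` and the EC2 `create_snapshot` call are real, but the paths, database name, and volume ID are placeholders you would supply yourself.

```python
import datetime
import os
import subprocess

# The 5 GB threshold mentioned in the answer (assumed here to mean GiB).
S3_SIZE_LIMIT_BYTES = 5 * 1024**3


def dump_path(dump_dir, db_name, now=None):
    """Build a timestamped dump file name (hypothetical naming scheme)."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    return os.path.join(dump_dir, f"{db_name}-{stamp}.dump")


def pg_dump_command(db_name, out_file):
    """Argument list for pg_dump; -Fc writes a custom-format archive
    that can be restored with pg_restore."""
    return ["pg_dump", "-Fc", "-f", out_file, db_name]


def fits_s3_rule(size_bytes):
    """True if the dump is under the size rule of thumb from the answer."""
    return size_bytes < S3_SIZE_LIMIT_BYTES


def snapshot_volume(ec2_client, volume_id, description):
    """Request an EBS snapshot and return its ID.
    ec2_client is expected to behave like boto3.client("ec2")."""
    response = ec2_client.create_snapshot(
        VolumeId=volume_id, Description=description
    )
    return response["SnapshotId"]


def run_backup(db_name, dump_dir, volume_id, ec2_client):
    """Dump the database, flush buffers, then snapshot the dump volume."""
    out_file = dump_path(dump_dir, db_name)
    subprocess.run(pg_dump_command(db_name, out_file), check=True)
    os.sync()  # ask the OS to write buffered data to disk before the snapshot
    description = f"pg_dump of {db_name}: {os.path.basename(out_file)}"
    return snapshot_volume(ec2_client, volume_id, description)
```

Run from cron, `run_backup` would be called with your database name, the dump directory (for example /pgsqldumps from the question), the ID of the EBS volume mounted there, and a `boto3.client("ec2")` instance. `fits_s3_rule(os.path.getsize(path))` can be used to decide whether a given dump is small enough to go to S3 instead.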

chantheman answered Oct 09 '22 01:10