I keep my DB at /mnt, on the ephemeral storage that comes with the EC2 instance. To take a backup using the EC2 API tools we need a volume ID, but in the AWS console I can find the volume ID of only the 8 GB root storage.
What should I do if I want a backup of the ephemeral storage? Is there any alternative for backing up instance storage?
Ephemeral storage (also called instance storage) is, as-is, like a /tmp folder: treat its contents as gone after a reboot. Strictly speaking, ephemeral drive contents aren't destroyed on a soft reboot, but they should be treated as if they were, since you can't realistically control or predict when your instance decides to die.
This has already been pointed out.
What I'd like to point out is that if you create and configure your AMIs appropriately, you can still use the ephemeral storage to drastically improve (read) throughput, so long as you also keep EBS drives for the actual storage.
What I'm using at the moment is Linux (Ubuntu Trusty Tahr) instances with bcache. I chose Tahr mainly because bcache kernel support is relatively new (IIRC, the first kernel with bcache was 3.10), so you'd definitely want as recent a kernel as possible. Also, Tahr is the next LTS version of Ubuntu, and it will be final by the time my project is close to launch ;)
Bcache, in its default configuration, allows you to benefit from the read speed of the ephemeral storage while giving you the persistence of EBS: It takes a fast cache device (ephemeral SSD) and uses it to speed up a slow device (EBS), writing through the cache device (that is, writing simultaneously to ephemeral cache and EBS).
This means that should an instance crash or otherwise be stopped, you can still mount the EBS volume directly without the cache, and access all your data as you would otherwise using only EBS volumes. You can also set up the now-wiped ephemeral devices again and re-attach them as a cache to the EBS volume to get back to enjoying very fast reads and seeks.
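To make the write-through behaviour described above concrete, here is a toy Python model. It is only an illustration of the idea, not how bcache is implemented: the class name, the two dictionaries standing in for the EBS volume and the ephemeral SSD, and the read counter are all made up for this sketch.

```python
class WriteThroughCache:
    """Toy model of a write-through cache in front of a slower backing store."""

    def __init__(self):
        self.backing = {}        # stands in for the persistent EBS volume
        self.cache = {}          # stands in for the ephemeral SSD
        self.backing_reads = 0   # how often a read had to go to the slow store

    def write(self, key, value):
        # Write-through: the backing store is updated on every write,
        # so the cache never holds the only copy of any data.
        self.backing[key] = value
        self.cache[key] = value

    def read(self, key):
        # Serve from the fast cache when possible.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: fall back to the backing store and repopulate the cache.
        self.backing_reads += 1
        value = self.backing[key]
        self.cache[key] = value
        return value

    def lose_cache(self):
        # Models the ephemeral device being wiped (instance stop, crash, ...).
        self.cache.clear()


store = WriteThroughCache()
store.write("row-1", "hello")
store.lose_cache()            # the ephemeral cache disappears
print(store.read("row-1"))    # the data is still served from the backing store
```

Because every write lands on the backing store, wiping the cache costs only read speed until the cache warms up again.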
My particular setup is two EBS devices, striped (RAID 0) using mdadm, plus two ephemeral SSD devices striped in the same manner. I've then configured them with bcache, using the ephemeral array as the cache device and the EBS array as the backing device. The EBS drives can be any size, and you can always expand them (a bit tricky with EC2, because, at the time of writing, you have to create a snapshot of the current EBS volumes and then create new, larger ones based on that snapshot; you can't resize an existing EBS volume in place).
Of course, you'll have to create a script that runs inside your instance at startup to configure the ephemeral storage and attach it as a cache device on your EBS backing device. I encourage reading up on, and experimenting with, mdadm and bcache.
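As a rough idea of what such a startup script has to do, the Python sketch below only builds and prints the commands; it does not run anything. The device names (/dev/xvdb, /dev/xvdc, /dev/md1, bcache0) are assumptions for illustration, and it assumes the EBS array was already formatted once as a bcache backing device with make-bcache -B. Details such as registering the devices or starting a backing device whose old cache is gone are left out, so check the mdadm and bcache documentation before relying on it.

```python
import shlex


def plan_cache_setup(ephemeral_devs, cache_md="/dev/md1", bcache_dev="bcache0"):
    """Return the shell commands (as strings) that would rebuild the striped
    cache array and attach it to an existing bcache device.
    Nothing is executed; all device names are hypothetical."""
    if len(ephemeral_devs) < 2:
        raise ValueError("striping needs at least two devices")

    def render(argv):
        return " ".join(shlex.quote(arg) for arg in argv)

    return [
        # 1. Stripe the (freshly wiped) ephemeral devices into one RAID 0 array.
        render(["mdadm", "--create", cache_md, "--level=0",
                "--raid-devices=%d" % len(ephemeral_devs)] + list(ephemeral_devs)),
        # 2. Format the array as a bcache cache device.
        render(["make-bcache", "-C", cache_md]),
        # 3. Print its superblock; the cset.uuid line holds the cache set UUID.
        render(["bcache-super-show", cache_md]),
        # 4. Attach the cache set to the backing device via sysfs.
        #    <cset.uuid> is a placeholder for the UUID read in step 3.
        "echo <cset.uuid> > /sys/block/%s/bcache/attach" % bcache_dev,
    ]


for command in plan_cache_setup(["/dev/xvdb", "/dev/xvdc"]):
    print(command)
```

Printing the plan first makes it easy to review the commands by hand before wiring them into an init script.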
For the record, testing with the Cassandra stress tool, I get better read performance with EBS volumes bcached with the ephemeral drives than I do with just striping the ephemeral drives. This is because of the algorithm used in bcache, which is very clever.
Using the ephemeral drives as a cache also reduces network traffic and is cost-effective, as it reduces I/O on EBS, and thereby your monthly bill.
Also note that bcache provides several caching modes besides the default write-through.
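For reference, the kernel's bcache documentation lists four cache modes: writethrough (the default), writeback, writearound and none. Only writeback can leave data that exists solely on the cache device, which is why it is a poor fit for an ephemeral cache. The sketch below parses the text of a bcache cache_mode sysfs file, where the active mode is shown in square brackets; the function names are made up for this example.

```python
CACHE_MODES = ("writethrough", "writeback", "writearound", "none")


def active_cache_mode(sysfs_text):
    """Return the active mode from text such as
    '[writethrough] writeback writearound none'."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            mode = token[1:-1]
            if mode not in CACHE_MODES:
                raise ValueError("unknown cache mode: %r" % mode)
            return mode
    raise ValueError("no active cache mode marked in %r" % sysfs_text)


def safe_on_ephemeral_cache(mode):
    """True if losing the cache device cannot lose data in this mode.
    Writeback may hold dirty data that has not reached the backing device yet."""
    if mode not in CACHE_MODES:
        raise ValueError("unknown cache mode: %r" % mode)
    return mode != "writeback"


mode = active_cache_mode("[writethrough] writeback writearound none")
print(mode, safe_on_ephemeral_cache(mode))
```

On a real instance the input would come from reading /sys/block/bcache0/bcache/cache_mode, where bcache0 is again an assumed device name.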
First and foremost, you should never store anything of lasting value on ephemeral storage in Amazon EC2 unless you know exactly what you are doing and are prepared to always have point-in-time backups etc. Your question seems to indicate that you might be mistaken about the concept of ephemeral storage, the difference between Amazon EC2 Instance Storage and Amazon EBS, and the significant implications regarding data safety and backup requirements:
Ephemeral storage will be lost on stop/start cycles and can generally go away, so you definitely don't want to put anything of lasting value there, i.e. only put data there that you can afford to lose or rebuild easily, like a swap file or strictly temporary data in use during computations. Of course you might store huge indexes there, for example, but you must be prepared to rebuild these after the storage has been cleared for whatever reason (instance stop, hardware failure, ...).
These explanations should clarify why you are unable to back up the ephemeral storage volumes with a mechanism that applies solely to EBS volumes (i.e. EBS snapshots). Accordingly, you can back up the former with the regular operating-system-level backup tool of your choice; Duplicity is a popular one and can optionally use Amazon S3 as a target, as addressed in my answer to "Easiest to use backup software for live linux server".
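As a minimal illustration of an operating-system-level backup (this is not Duplicity, just Python's standard tarfile module), the sketch below packs a directory into a timestamped .tar.gz on persistent storage. The function name, the "mnt-backup" prefix and the paths in the usage comment are assumptions, and it presumes the database has been dumped or stopped first so that the files on disk are consistent. Copying the archive off the instance, for example to S3, is left out.

```python
import tarfile
import time
from pathlib import Path


def archive_directory(source_dir, dest_dir, prefix="mnt-backup"):
    """Pack source_dir into dest_dir/<prefix>-<UTC timestamp>.tar.gz
    and return the path of the new archive."""
    source = Path(source_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    stamp = time.strftime("%Y%m%dT%H%M%SZ", time.gmtime())
    archive = dest / ("%s-%s.tar.gz" % (prefix, stamp))

    # Store entries under the directory's own name rather than its full path.
    arcname = source.name or "root"
    with tarfile.open(str(archive), "w:gz") as tar:
        tar.add(str(source), arcname=arcname)
    return archive


# Hypothetical usage, writing to a directory on an EBS volume:
# archive_directory("/mnt", "/backups")
```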