MongoDB hot backup: copy data/db vs. replica set with fsyncLock

I have read about the different MongoDB setups for taking backups without downtime. Which strategy is best, or can they even be compared?

  1. Enable journaling and simply copy the /data/db directory. It is unclear to me whether this is enough: the MongoDB home page states that you have to "snapshot it", giving SAN and LVM as examples.

    Questions:

    What does "snapshot" mean in this context? Will a copy command count as a snapshot? Is it safe to copy the data directory of a journaling MongoDB (2.0+) on a Windows server with NTFS? How do you ensure that this is safe on your own filesystem and setup?

  2. Establish a replica set with two servers and an arbiter, then use rs.status() and fsyncLock/fsyncUnlock to ensure the data is read-only on the secondary server while taking the backup.

    > db.fsyncLock
    function () {
        return db.adminCommand({fsync:1, lock:true});
    }
    > db.fsyncUnlock
    function () {
        return db.getSiblingDB("admin").$cmd.sys.unlock.findOne();
    }
    

    Questions:

    If you use locks in a replica set, it seems that writes and reads can be blocked for the whole replica set - is it true that this bug has not been fixed?

    What if the secondary is voted in as primary while the backup is in progress? Will the backup process stop, or will the replica set stop responding to write requests until it is unlocked?

    Considerations:

    For now I would like the simple solution: just copy the data/db directory including the journal files, and hold off on the replica set. MongoDB runs on a 64-bit Windows server (RackSpace Cloud).

asked Feb 29 '12 by user1240303

1 Answer

The best bet is to do fsync + lock on a secondary, then snapshot the volume at the disk or volume level (e.g. using LVM2, Hyper-V, or btrfs), unlock the database, and then copy the snapshotted data files. This minimizes downtime of the secondary and is easy to restore.

"Snapshotting" in this context refers to the snapshot features offered by some volume managers, file systems and hypervisors. Essentially, this is a 'copy-on-write' feature for block devices: instead of overwriting data when the OS demands it, it will write the new data elsewhere and keep both the old version and the new version readable. Snapshotting usually takes almost no time, but on some systems, it's a bad idea to keep many snapshots of the same files, because it may dramatically slow future writes.

Why I believe this is the best strategy for full backups:

  1. Using mongodump won't store the index data, only the index definitions. The indexes will be rebuilt on restore, and rebuilding indexes for recovery can take hours - the last thing you need when everybody is yelling at you is an operation that takes hours and can't be accelerated.

  2. Fsync + lock will block writers and might block readers; hence, it's best to do that on a (passive) secondary, not on the primary.

  3. Halting a secondary will fill the oplog, which is why you should keep the lock time as short as possible. Instead of copying all data files (which could take hours) during the lock, merely performing a snapshot should take only a couple of seconds; hence, oplog limits are not a concern. (See the sketch after this list for checking your oplog window.)

  4. Everything is 'back to normal' while the actual copy is running, which gives you peace of mind. The only difference will be higher load on a secondary during the backup, which shouldn't be a major concern.
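
As referenced in point 3, you can check how much lock time the oplog window can absorb, and whether the secondary has caught up again afterwards, using the built-in shell helpers (output omitted here):

    > rs.printReplicationInfo()       // oplog size and the time window it covers
    > rs.printSlaveReplicationInfo()  // replication lag of each secondary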

Addressing your questions:

  • Regarding locks in replica sets: keep the lock time short, and use a passive secondary (which can't be elected master; see the configuration sketch after this list) so the writer queue can't stall.

  • "What if the secondary is voted in as primary while the backup is in progress" can't happen if your backup system is passive

For now I would like the simple solution: just copy the data/db directory including the journal files, and hold off on the replica set. MongoDB runs on a 64-bit Windows server (RackSpace Cloud).

You can do that. Volume snapshotting is probably still the best way to go, giving you only seconds of downtime. If your data is small, a simple mongodump might be even easier, but make sure recovery times are acceptable (depends on your indexes).

answered by mnemosyn