
How can I backup a MongoDB GridFS database the easiest way?

As the title says, I have a MongoDB GridFS database holding a whole range of file types (e.g., text, pdf, xls), and I want to back up this database the easiest way.

Replication is not an option. Preferably I'd like to do it the usual database way: dump the database to a file and then back up that file, which could later be used to restore the entire database 100% if needed. Can that be done with mongodump? I also want the backup to be incremental. Will that be a problem with GridFS and mongodump?

Most importantly, is that the best way of doing it? I am not that familiar with MongoDB. Will mongodump work as well as mysqldump does with MySQL? What's the best practice for MongoDB GridFS and incremental backups?

I am running Linux if that makes any difference.

c00kiemonster asked Jan 19 '12 13:01

1 Answer

GridFS stores files in two collections: fs.files and fs.chunks.

More information on this may be found in the GridFS Specification document: http://www.mongodb.org/display/DOCS/GridFS+Specification

Both collections may be backed up using mongodump, the same as any other collection. The documentation on mongodump may be found here: http://www.mongodb.org/display/DOCS/Import+Export+Tools#ImportExportTools-mongodump

From a terminal, this would look something like the following. For this demonstration, my db name is "gridFS".

First, mongodump is used to back up the fs.files and fs.chunks collections to a folder on my desktop:

$ bin/mongodump --db gridFS --collection fs.chunks --out /Desktop
connected to: 127.0.0.1
DATABASE: gridFS     to     /Desktop/gridFS
    gridFS.fs.chunks to /Desktop/gridFS/fs.chunks.bson
         3 objects
$ bin/mongodump --db gridFS --collection fs.files --out /Desktop
connected to: 127.0.0.1
DATABASE: gridFS     to     /Desktop/gridFS
    gridFS.fs.files to /Desktop/gridFS/fs.files.bson
         3 objects

Now, mongorestore is used to pull the backed-up collections into a new (for the purpose of demonstration) database called "gridFScopy":

$ bin/mongorestore --db gridFScopy --collection fs.chunks /Desktop/gridFS/fs.chunks.bson 
connected to: 127.0.0.1
Thu Jan 19 12:38:43 /Desktop/gridFS/fs.chunks.bson
Thu Jan 19 12:38:43      going into namespace [gridFScopy.fs.chunks]
3 objects found
$ bin/mongorestore --db gridFScopy --collection fs.files /Desktop/gridFS/fs.files.bson 
connected to: 127.0.0.1
Thu Jan 19 12:39:37 /Desktop/gridFS/fs.files.bson
Thu Jan 19 12:39:37      going into namespace [gridFScopy.fs.files]
3 objects found

Now the Mongo shell is started, so that the restore can be verified:

$ bin/mongo
MongoDB shell version: 2.0.2
connecting to: test
> use gridFScopy
switched to db gridFScopy
> show collections
fs.chunks
fs.files
system.indexes
> 

The collections fs.chunks and fs.files have been successfully restored to the new DB.
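Listing the collections only shows that they exist. As a further sanity check, the document counts in the original and restored databases can be compared. The following is a minimal sketch, not part of the original demonstration: the function names and the MONGO variable are hypothetical, and it assumes the legacy mongo shell is on the PATH. Equal counts do not prove the contents are identical.

```shell
#!/bin/sh
# Sketch: compare GridFS document counts between two databases.
# MONGO, count_docs and verify_restore are illustrative names (assumptions).
MONGO="${MONGO:-mongo}"

# Print the number of documents in collection $2 of database $1.
count_docs() {
    "$MONGO" "$1" --quiet --eval "print(db.getCollection('$2').count())"
}

# Return 0 if fs.files and fs.chunks have equal counts in databases $1 and $2.
verify_restore() {
    for coll in fs.files fs.chunks; do
        src=$(count_docs "$1" "$coll")
        dst=$(count_docs "$2" "$coll")
        if [ "$src" != "$dst" ]; then
            echo "MISMATCH $coll: $src vs $dst"
            return 1
        fi
        echo "OK $coll: $src documents"
    done
}
```

With the databases from this demonstration, it would be called as: verify_restore gridFS gridFScopy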

You can write a script to perform mongodump on your fs.files and fs.chunks collections periodically.
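Such a script might look like the sketch below. It is only an illustration under stated assumptions: the backup root /var/backups/mongodb, the timestamped directory naming, and the "run" argument are all hypothetical choices, and mongodump is assumed to be on the PATH. Each run produces a full dump of both collections in its own directory.

```shell
#!/bin/sh
# Sketch of a periodic GridFS backup. The database name, the backup root,
# and the "run" argument convention are assumptions for illustration.
set -eu

DB="${DB:-gridFS}"
BACKUP_ROOT="${BACKUP_ROOT:-/var/backups/mongodb}"
MONGODUMP="${MONGODUMP:-mongodump}"

# Print the output directory for timestamp $1.
backup_dir() {
    printf '%s/%s-%s\n' "$BACKUP_ROOT" "$DB" "$1"
}

# Dump fs.files and fs.chunks into directory $1, creating it if needed.
dump_gridfs() {
    mkdir -p "$1"
    for coll in fs.files fs.chunks; do
        "$MONGODUMP" --db "$DB" --collection "$coll" --out "$1"
    done
}

# Only perform a dump when invoked with the argument "run".
if [ "${1:-}" = "run" ]; then
    dump_gridfs "$(backup_dir "$(date +%Y%m%d-%H%M%S)")"
fi
```

Saved under a name of your choosing and made executable, it could be scheduled from cron, for example nightly at 02:00 with an entry such as: 0 2 * * * /path/to/gridfs-backup.sh run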

As for incremental backups, MongoDB does not really support them. There is a good mongodb-user Google Groups discussion on the subject: http://groups.google.com/group/mongodb-user/browse_thread/thread/6b886794a9bf170f

For continuous backups, many users use a replica set. (I realize that you stated in your original question that this is not an option; it is included for other members of the Community who may be reading this response.) A member of a replica set can be hidden to ensure that it will never become Primary and will never be read from. More information on this may be found in the "Member Options" section of the Replica Set Configuration documentation: http://www.mongodb.org/display/DOCS/Replica+Set+Configuration#ReplicaSetConfiguration-Memberoptions

Marc answered Oct 22 '22 21:10