I am creating a mongodb/nodejs blogging system (similar to wordpress).
I currently have the images being saved on the disk and a pointer being placed in mongo. I was wondering since I have all sessions being stored in MongoDB to enable easy load balancing across servers, would storing the actual files in Mongo also be a smart idea for easy multiserver setups and/or performance gains.
If everything is stored in a DB, you can simply spawn more web servers and/or mongo replications to scale horizontally
Opinions?
Large objects, or "files", are easily stored in MongoDB. It is no problem to store 100MB videos in the database. This has a number of advantages over files stored in a file system. Unlike a file system, the database will have no problem dealing with millions of objects.
Documents are stored on disk using block compression to reduce storage usage. Documents are automatically uncompressed in memory when retrieved by the MongoDB server. Each collection & index is stored in a separate file within the storage.
MongoDB can easily be combined with different Database Management Systems, both SQL and NoSQL types. Document-oriented structure makes MongoDB schema dynamically flexible and different types of data can be easily stored and manipulated.
No, MongoDB is not a good place for storing files. If you want to store files, you should use storages like Amazon S3 or Google Could Storage. The good practice is to store the files in a storage and then to just save the URL of the uploaded image in the MongoDB.
MongoDB is a good option to store your files (I'm talking about GridFS), specially for the use case you described above. When you store files into MongoDB (GridFS, not documents), you get all the replication and sharding capability for free, which is awesome.
If you have to spawn a new server and you have the files already into MongoDB, all you have to do is to enable replication (thus scale horizontally). I'm sure this can save you a lot of headaches.
Resources:
Is GridFS fast and reliable enough for production?
http://www.mongodb.org/display/DOCS/GridFS
http://www.coffeepowered.net/2010/02/17/serving-files-out-of-gridfs/
Aside from GridFS, you might be considering a cloud-based deployment. In that case, you might consider storing files in cloud-specific storage (Windows Azure has Blob Storage, for example). Sticking with Windows Azure for this example (since that's what I work with), you'd reference a file by its storage account URI. For example:
https://mystorageacct.blob.core.windows.net/mycontainer/myvideo.wmv
Since you'd be storing the MongoDB database itself in its own blob (and mounted as disk volume on your Linux or Windows VM), you could then choose to store your files in either the same storage account or a completely different storage account (with each storage account providing 100TB 200TB of storage).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With