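We are using MongoDB, and our application needs to store multiple files per user. The application will be deployed on AWS. I was thinking of using one of the following options to store the files:
1. Store the files in MongoDB itself.
2. Use MongoDB's GridFS to store the files.
3. Use Amazon S3.
4. Use the local file system.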
Which of the above is the better approach in terms of performance and scalability? I also want to use a CDN to cache the files. My preference is AWS S3: I can put a CDN in front of it, my file storage stays DB-agnostic, and my DB size will not grow significantly because the files are stored outside the DB.
In MongoDB, use GridFS for storing files larger than 16 MB. In some situations, storing large files may be more efficient in a MongoDB database than on a system-level filesystem. If your filesystem limits the number of files in a directory, you can use GridFS to store as many files as needed.
MongoDB stores data and indexes on disk in a compressed binary format.
By default Mongo stores its data in the directory /data/db . You can specify a different directory using the --dbpath option. If you're running Mongo on Windows then the directory will be C:\data\db , where C is the drive letter of the working directory in which Mongo was started.
GridFS is the MongoDB specification for storing and retrieving large files such as images, audio files and video files. It works like a file system, but its data is stored in MongoDB collections. Because GridFS splits each file into chunks and stores every chunk as a separate document, it can store files larger than MongoDB's 16 MB document size limit.
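To illustrate how a file larger than 16 MB can still fit into ordinary documents, here is a minimal sketch in plain Python of GridFS-style chunking. It does not talk to MongoDB at all. The function names `split_into_chunks` and `reassemble` are hypothetical, and the field names (`files_id`, `n`, `data`, `length`, `chunkSize`) and the 255 KiB default chunk size are taken from the GridFS specification as I understand it. A real application would use its driver's GridFS API rather than this.

```python
DEFAULT_CHUNK_SIZE = 255 * 1024  # GridFS default chunk size in bytes (255 KiB)


def split_into_chunks(file_id, data, chunk_size=DEFAULT_CHUNK_SIZE):
    """Return (file_doc, chunk_docs) shaped like GridFS 'files' and 'chunks' documents."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Each chunk document points back at its file and records its position n.
    chunk_docs = [
        {"files_id": file_id, "n": index, "data": data[offset:offset + chunk_size]}
        for index, offset in enumerate(range(0, len(data), chunk_size))
    ]
    # The file document holds only metadata, so it stays small.
    file_doc = {"_id": file_id, "length": len(data), "chunkSize": chunk_size}
    return file_doc, chunk_docs


def reassemble(chunk_docs):
    """Rebuild the original bytes by concatenating chunks in order of n."""
    ordered = sorted(chunk_docs, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)
```

Every chunk document stays far below the 16 MB limit, which is why the total file size is not bounded by it.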
I think the best options here are GridFS and S3; I would go with the latter myself. Push the file to S3 and then store the bucket name and file key in your Mongo document. Unless your business or querying requirements are such that all the data must be present in the document, I think this is the best way to go.
I've used this solution in production and it scales very easily. The impact to your Mongo collection is small and you don't have to worry about storing huge amounts of data there. Just store the key and let S3 take care of all that. You can always store them somewhere else later since your system is fairly storage-agnostic.
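A minimal sketch of the "push to S3, store bucket and key in Mongo" flow might look like this. It assumes `s3_client` behaves like a boto3 S3 client (`put_object` / `get_object`) and `files_collection` like a PyMongo collection (`insert_one`). The function names, the key layout and the document fields are my own assumptions, not anything prescribed by either library.

```python
import uuid


def store_user_file(s3_client, files_collection, bucket, user_id, filename, data):
    """Upload the bytes to S3, then record where they live in MongoDB."""
    # Hypothetical key layout: one prefix per user plus a random part,
    # so two uploads with the same filename do not overwrite each other.
    key = f"users/{user_id}/{uuid.uuid4().hex}/{filename}"
    s3_client.put_object(Bucket=bucket, Key=key, Body=data)

    # Only a small pointer document goes into MongoDB, never the file itself.
    file_doc = {
        "user_id": user_id,
        "filename": filename,
        "bucket": bucket,
        "key": key,
        "size": len(data),
    }
    files_collection.insert_one(file_doc)
    return file_doc


def load_user_file(s3_client, file_doc):
    """Fetch the bytes back from S3 using the stored bucket and key."""
    response = s3_client.get_object(Bucket=file_doc["bucket"], Key=file_doc["key"])
    return response["Body"].read()
```

With real services you would pass something like `boto3.client("s3")` and a collection obtained from `pymongo.MongoClient()`. Because the document only holds a bucket and a key, moving the files elsewhere later means rewriting those two fields rather than migrating binary data out of the database.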
First of all, I would only use the local file system in development; in production I would use GridFS or Amazon S3.
Let me go through the options one by one.
First point (Store the files in MongoDB itself.)
Cons
Every query reads whole documents, embedded file data included, so it will take longer (you can exclude the 'images' field from the results to avoid this).
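Excluding the field can be sketched with a projection. This assumes `users_collection` behaves like a PyMongo collection and that the file data sits in an `images` field as mentioned above; the function name is hypothetical.

```python
def find_user_without_images(users_collection, user_id):
    """Fetch one user document but leave the heavy 'images' field out."""
    # {"images": 0} is an exclusion projection: every field except
    # "images" is returned, so the file data is not sent to the client.
    return users_collection.find_one({"_id": user_id}, projection={"images": 0})
```

The document on disk is unchanged; the projection only controls which fields come back to the application.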
Second point (Use MongoDB's GridFS to store the files.)
Take a look at the GridFS docs.
This article about the pros and cons of using GridFS can be helpful too,
as can "When should I use GridFS?"
Third point (Use Amazon S3)
I'm not very familiar with S3 (I have never used it).
But this is from the Amazon docs on when to use Amazon S3:
S3 is free to join and is a pay-as-you-go service, meaning you only ever pay for the hosting and bandwidth that you use, which makes it very attractive for start-up, agile and lean companies looking to minimize costs.
On top of this, the fully scalable, fast and reliable service provided by Amazon makes it highly attractive to video producers and marketers all over the world.
Amazon offers S3 as a hosting system, with pricing dependent on the geographic location of the datacenter where you store your videos.
Fourth point (Use local file system)
I only use the file system when I'm testing my apps. I never use it in production because, from my point of view, it is not very scalable.
In my personal opinion I would use GridFS, but I think you have to analyze the requirements of your application to know which storage adapter to use.
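One way to keep that choice open is a small storage adapter layer, so the rest of the application never cares where the bytes live. The following is only a sketch: the class names, the `save`/`load` interface and the `make_storage` helper are hypothetical, and `S3Storage` assumes a client that behaves like a boto3 S3 client. A GridFS adapter could expose the same two methods.

```python
from pathlib import Path


class LocalFileStorage:
    """Stores files under a root directory; meant for development and tests."""

    def __init__(self, root):
        self.root = Path(root)

    def save(self, key, data):
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def load(self, key):
        return (self.root / key).read_bytes()


class S3Storage:
    """Stores files in one S3 bucket via a boto3-like client (assumed)."""

    def __init__(self, s3_client, bucket):
        self.s3_client = s3_client
        self.bucket = bucket

    def save(self, key, data):
        self.s3_client.put_object(Bucket=self.bucket, Key=key, Body=data)

    def load(self, key):
        response = self.s3_client.get_object(Bucket=self.bucket, Key=key)
        return response["Body"].read()


def make_storage(env, local_root="uploads", s3_client=None, bucket=None):
    """Pick an adapter by environment name: S3 in production, local otherwise."""
    if env == "production":
        if s3_client is None or bucket is None:
            raise ValueError("production storage needs s3_client and bucket")
        return S3Storage(s3_client, bucket)
    return LocalFileStorage(local_root)
```

Application code then calls only `storage.save(key, data)` and `storage.load(key)`, so switching from the local file system to S3 or GridFS becomes a configuration change.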