
Storing images in the DB vs. the filesystem for user-uploaded images on a website

I am building a website where users will be allowed to upload images. There is also a limit on the maximum amount of storage each user can use.

I have two ideas in mind:

  1. Store the images in a NoSQL database such as MongoDB, using GridFS.
  2. Store the images in the file system and keep the path in the DB.

Which of these is better, and why?

Ahmed Shabib asked Apr 24 '14 08:04





1 Answer

Sigh. Why does everybody jump to GridFS?

Depending on the size of the images and the exact use case, I'd recommend storing the images directly in the DB as ordinary documents (not via GridFS). Here's why:

File System

  • Storing the images in the file system is proven to work well, but it's not trivial
  • You will need backup, failover, replication, etc. for the files, separate from whatever the database already has. This can be tricky DevOps-wise
  • You will need to create a smart directory structure, which is a leaky abstraction, because different file systems have very different characteristics. Some have no problem storing 16k files in one folder, others start to choke at a mere 1k files. A common approach is to use a convention like af/2c/af2c2ab3852df91.jpg, where the folders af and 2c are inferred from the file name (which itself might be a hash of the content for deduplication purposes).
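
To illustrate the af/2c/... convention, here is a minimal Python sketch. It assumes SHA-256 as the content hash and two directory levels of two hex characters each; the function name `sharded_path` and its parameters are made up for this example, not taken from any library.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(content: bytes, ext: str = "jpg",
                 levels: int = 2, width: int = 2) -> PurePosixPath:
    """Derive a relative storage path from the image bytes.

    The file name is the hex digest of the content, so identical uploads
    map to the same path. The leading hex characters of the digest become
    the nested folder names, which spreads files across directories.
    """
    digest = hashlib.sha256(content).hexdigest()
    # e.g. digest "af2c..." -> folders "af" and "2c"
    folders = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(*folders, f"{digest}.{ext}")
```

With two levels of two hex characters there are 256 * 256 = 65,536 leaf folders, so even a few million images stay well below the per-folder counts mentioned above.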

GridFS

GridFS is made for storing large files, and for storing files in a very similar way to a file system. That comes with some disadvantages:

  • For every file, you will need one fs.files document and at least one fs.chunks document. Chunking is totally required for large files, but if your files are smaller than the chunk size on average, there's no real chunking going on (the default chunk size is 255 kB; older versions used 256 kB). So when storing small files in GridFS, you get the overhead without the advantage. Bad deal. Reading a file also requires two queries instead of one: one for the metadata and one for the chunks.
  • It imposes a certain structure on your collection, for instance having a file name. It depends on the use case, but I often choose to use a hash of the content as the id and store that hash in the user document, for example. That deduplicates, is easy to implement, aligns beautifully with caching and doesn't require coming up with any naming convention. It's also very efficient because the indexed id is a compact byte array.
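
The hash-as-id idea can be sketched as follows. This is an in-memory illustration only: plain dicts stand in for an images collection and a users collection, SHA-256 is an assumed choice of hash, and the `ImageStore` class with its methods is hypothetical. The `usage` method is one possible way to check the per-user space limit from the question.

```python
import hashlib


class ImageStore:
    """In-memory stand-in for two collections: images and users."""

    def __init__(self) -> None:
        self.images: dict[bytes, bytes] = {}       # content hash -> image bytes
        self.users: dict[str, list[bytes]] = {}    # user id -> hashes they own

    def put(self, user_id: str, data: bytes) -> bytes:
        """Store an image under its content hash and reference it from the user."""
        digest = hashlib.sha256(data).digest()     # 32-byte id
        # Identical content yields the same id, so it is stored only once.
        self.images.setdefault(digest, data)
        refs = self.users.setdefault(user_id, [])
        if digest not in refs:
            refs.append(digest)
        return digest

    def get(self, digest: bytes) -> bytes:
        """Fetch an image with a single lookup by id."""
        return self.images[digest]

    def usage(self, user_id: str) -> int:
        """Total bytes referenced by a user, e.g. for enforcing a quota."""
        return sum(len(self.images[d]) for d in self.users.get(user_id, []))
```

Because the id depends only on the content, it also works as a cache key: an image behind a given id never changes.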

Things might look different if you're operating a site for photographers where they can upload their RAW files or large JPEGs of 10 MB or more. In that case, GridFS is probably a good choice, not least because a single MongoDB document is capped at 16 MB. For storing user images, thumbnails, etc., I'd simply store each image flat in its own document.
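
To put rough numbers on that trade-off, the sketch below counts how many documents GridFS would write for a file of a given size, and checks whether the file could live in one flat document instead. It assumes the 255 kB default chunk size of current MongoDB drivers and the 16 MB BSON document limit; the function names and the 1 kB allowance for other fields are illustrative assumptions.

```python
import math

DEFAULT_CHUNK_SIZE = 255 * 1024          # assumed GridFS default, in bytes
MAX_BSON_DOCUMENT = 16 * 1024 * 1024     # MongoDB's per-document size limit


def gridfs_documents(file_size: int, chunk_size: int = DEFAULT_CHUNK_SIZE) -> int:
    """Documents written by GridFS: one metadata document plus the chunks."""
    chunks = math.ceil(file_size / chunk_size)
    return 1 + chunks


def fits_in_single_document(file_size: int, overhead: int = 1024) -> bool:
    """True if the file plus a small allowance for other fields stays under the limit."""
    return file_size + overhead <= MAX_BSON_DOCUMENT
```

Under these assumptions a 50 kB thumbnail costs two documents in GridFS versus one when stored flat, while a 20 MB RAW file cannot be stored flat at all and has to be chunked.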

mnemosyn answered Oct 01 '22 07:10
