Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Do we really need a table in database to store file path of image if images are stored in file system? [closed]

We all store images either in database or in file system. If we store images in file system, we store a path of the image in database. We have been doing so all the time.

One of my colleague thought about storing images in file system, each user having a separated folder, named after their username, for storage of their images without storing a file path in database. For example, When a user sends a request to show his/her images then, we search a folder named after that user in the file system and showing all images of that folder. When a user uploads a image then it is stored in file system under his/her folder. Users cannot change their username and have no access right to change folder name.

Can we really omit database table for image path? Is it ok to design a system in this way? What are the possible issues associated with this kind of design?

like image 386
IamGhale Avatar asked Sep 01 '14 11:09

IamGhale


2 Answers

Can we really omit database table for image path?

Of course you can but (see later) I'd consider this solution only for very small databases.

Is it ok to design a system in this way?

IMO it's not unless you're working with a very small amount of data, with pretty fixed requirements and easy logic around them.

What are the possible issues associated with this kind of design?

Here long text comes. Common issues I may think about are (at least) these:

  1. If you use a technology like FILESTREAM (or equivalent, according to your environment) tricky operations like delete and rename will be handled correctly by database engine. Doing by hand (with or without file path stored in a column) will make your code weaker unless you carefully design such operations (primary because of concurrency). What's easier to write with a path column is the script to find orphans. This may (or not) be an issue (as you said an user is forever) but it's something you have to consider.
  2. For security reasons you may setup your storage space to be read/write but not to be browsed. If you don't have a path column you need to make your storage browseable.
  3. If you need paths in your queries then you need that column (for example to count how many images each user has, average number of images per user or to prepare a list without accessing disk).
  4. With such column you'll be safer also from concurrency point of view. If one user is accessing the list (for example reading files) someone else may decide to delete them. If access is coordinated with a lock at table/record level then it'll work naturally but if they're unrelated then this has to be done in your code (don't underestimate this).
  5. If space to store images increases then you can easily add new disks, this it's easy to handle if you don't have to search for proper path but it's stored inside DB. You may also have different disks (with different speeds) for different users and you won't pollute your code with such details (image browser will ask paths, it's always unaware of such business stuff). Of course you may do it in code but then you'll put such logic everywhere you need to gather such list, point here is code maintainability.
  6. Path stored in a database column can be a virtual path (as a plus for what I said in point 5) and it may be resolved with data from other tables. Everything can be done in SQL and your code will just see a path. If DB has not such information then this calculation has to be done in code (and it'll be harder to change because it's spanned). Again point is maintainability.
  7. To reuse an image for multiple users you can simply duplicate its path (assuming it's read-only).
  8. Images can be made public (to be visible to many users). This information has to stay in DB but if you don't have a file path then it'll be harder.
  9. Images may not even be stored locally but in another server (through HTTP). It's an easy change if you have paths in DB but it's a pain if such code is in your presentation.

Please note that points from 5 to 9 imply some kind of business logic. It may be in the database layer (in SQL or with external code - for example .NET for MS SQL Server) or in your Data Access Layer outside DB. The true difference here is that with such column you have flexibility to put such logic where you prefer (according to your specific architecture and also move it up if you increase layers). If you don't have such column then you can't simply put such logic in DB (and this is especially bad for small 2 tiers applications).

Here a simple requirement that will convince me to adopt that path column:

Store images bigger than 10 MB on path \\BIGDISK\IMAGES.
Images smaller than 10 MB are on D:\IMAGES.
Premium users always stores on D:\IMAGES.
Trial users will store on remote server http://example.com/very_slow_storage).

Of course you can do it in your code but it's easier in SQL and it's (more important) in one well-known point, not spanned across your presentation logic (to insert images, to upload them, to get the list, to calculate statistics, to build a preview...). Even more important: it's easy to change, no need to update (and to deploy/merge) configuration files or (even worse) with code changes.

If you answer "I don't care" for all these points and you're not planning any change in future then IMO you can drop such column in your database.

Of course it's a design change that you may reconsider every time you have a new requirement (or when you find your presentation code is aware of details it should really ignore - such as where an image should be stored).

like image 93
Adriano Repetti Avatar answered Oct 22 '22 12:10

Adriano Repetti


The point of a column in your database is for metadata, and for concurrency. Filesystems are relatively slow at searches compared to databases. So would you rather want your web server searching the disk or a database to find it's files? I'd prefer the database.

On top of that, you can associate other metadata about the files.. (upload time, user who uploaded, thumbnaails associated, permissions on who can access it, etc..)

like image 5
Erik Funkenbusch Avatar answered Oct 22 '22 11:10

Erik Funkenbusch