
What is the best way to associate a file with a piece of data?

Tags: file, database

I have an application that creates records in a table (rocket science, I know). Users want to associate files (.doc, .xls, .pdf, etc.) with a single record in the table.

  • Should I store the contents of the file(s) in the database? Wouldn't this bloat the database?

  • Should I store the file(s) on a file server, and store the path(s) in the database?

What is the best way to do this?

Aaron Daniels asked Mar 05 '09 21:03




1 Answer

I think you've accurately captured the two most popular approaches to solving this problem. There are pros and cons to each:

Store the Files in the DB

Most RDBMSs support storing BLOBs (binary file data such as .doc, .xls, etc.) in the database, so you're not breaking new ground here.
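As a rough illustration of the in-database approach, here is a minimal sketch using Python's built-in sqlite3 module. The `attachment` table, its columns, and the function names are assumptions made up for this example; a production RDBMS would use its own BLOB type and driver.

```python
import sqlite3


def create_blob_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical table: one row per attached file, linked to a record id.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS attachment (
            id            INTEGER PRIMARY KEY,
            record_id     INTEGER NOT NULL,
            original_name TEXT    NOT NULL,
            content       BLOB    NOT NULL
        )
        """
    )


def save_attachment(conn: sqlite3.Connection, record_id: int,
                    original_name: str, data: bytes) -> int:
    # The file bytes go straight into the BLOB column next to the metadata.
    cur = conn.execute(
        "INSERT INTO attachment (record_id, original_name, content) "
        "VALUES (?, ?, ?)",
        (record_id, original_name, data),
    )
    conn.commit()
    return cur.lastrowid


def load_attachment(conn: sqlite3.Connection, attachment_id: int) -> bytes:
    row = conn.execute(
        "SELECT content FROM attachment WHERE id = ?", (attachment_id,)
    ).fetchone()
    if row is None:
        raise KeyError(attachment_id)
    return bytes(row[0])
```

Note that the whole file is held in memory on both the write and the read, which is part of why large files make this approach awkward.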

Pros

  • Simplifies backup of the data: when you back up the db, you have all the files.
  • The linkage between the metadata (the other columns ABOUT the files) and the file itself is solid and built into the db, so it's a one-stop shop to get data about your files.

Cons

  • Backups can quickly blossom into a HUGE nightmare, as you're storing all of that binary data with your database. You could alleviate some of the headaches by keeping the files in a separate DB.
  • Without the DB or an interface to the DB, there's no easy way to get to the file content to modify or update it.
  • In general, it's harder to code and coordinate the upload and storage of data to a DB than to the filesystem.

Store the Files on the FileSystem

This approach is pretty simple: you store the files themselves in the filesystem, and your database stores a reference to each file's location (as well as all of the metadata about the file). One helpful hint here is to standardize your naming scheme for the files on disk (don't use the file name that the user gives you; generate one of your own and store theirs in the db).
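A minimal sketch of that naming hint, assuming Python with sqlite3 for the metadata: the file is written to disk under a generated name, and the user's original name is kept only in the database. The `file_ref` table, its columns, and `store_file` are hypothetical names chosen for this example.

```python
import sqlite3
import uuid
from pathlib import Path


def create_ref_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical table: metadata plus a reference to the file on disk.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS file_ref (
            id            INTEGER PRIMARY KEY,
            record_id     INTEGER NOT NULL,
            original_name TEXT    NOT NULL,
            stored_name   TEXT    NOT NULL UNIQUE
        )
        """
    )


def store_file(conn: sqlite3.Connection, storage_dir: Path, record_id: int,
               original_name: str, data: bytes) -> Path:
    storage_dir.mkdir(parents=True, exist_ok=True)
    # Generate the on-disk name ourselves; the user's name never touches
    # the filesystem, so odd characters or "../" in it are harmless.
    stored_name = uuid.uuid4().hex
    target = storage_dir / stored_name
    target.write_bytes(data)
    conn.execute(
        "INSERT INTO file_ref (record_id, original_name, stored_name) "
        "VALUES (?, ?, ?)",
        (record_id, original_name, stored_name),
    )
    conn.commit()
    return target
```

Storing only the generated name (rather than an absolute path) is a design choice in this sketch: it lets the storage directory move without rewriting every row.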

Pros

  • Keeps your file data cleanly separated from the database.
  • Easy to maintain the files themselves: if you need to change out or update a file, you do so in the file system itself. You can just as easily do it from the application via a new upload.

Cons

  • If you're not careful, the database records about the files can get out of sync with the files themselves.
  • Security can be an issue (again, if you're careless), depending on where you store the files and whether or not that filesystem is available to the public (via the web, I'm assuming here).
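The out-of-sync risk in the list above can be checked for periodically. The following is only a sketch, assuming a hypothetical `file_ref` table with a `stored_name` column and a single flat storage directory; it reports database rows whose file is missing and files on disk that no row references.

```python
import sqlite3
from pathlib import Path
from typing import List, Tuple


def find_mismatches(conn: sqlite3.Connection,
                    storage_dir: Path) -> Tuple[List[str], List[str]]:
    # Names the database believes exist (hypothetical table and column).
    in_db = {row[0] for row in conn.execute("SELECT stored_name FROM file_ref")}
    # Names actually present in the storage directory.
    on_disk = {p.name for p in storage_dir.iterdir() if p.is_file()}
    missing_files = sorted(in_db - on_disk)   # row exists, file is gone
    orphan_files = sorted(on_disk - in_db)    # file exists, no row points to it
    return missing_files, orphan_files
```

What to do with each list (delete orphans, flag broken rows, alert someone) depends on the application and is left out here.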

At the end of the day, we chose to go the filesystem route. It was easier to implement quickly, easy to back up, and pretty secure once we locked down any holes and streamed the file out (instead of just serving directly from the filesystem). It's been operational in pretty much the same format for about 6 years in two different government applications.
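Streaming the file out rather than serving it directly might look something like the sketch below. This is not the code from those applications; it assumes a hypothetical `file_ref` table with a `stored_name` column, and the permission check is left as a comment because it depends entirely on the application.

```python
import sqlite3
from pathlib import Path
from typing import Iterator


def _iter_chunks(path: Path, chunk_size: int) -> Iterator[bytes]:
    # Read the file piece by piece so large files are never fully in memory.
    with path.open("rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            yield chunk


def open_stream(conn: sqlite3.Connection, storage_dir: Path, file_id: int,
                chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    # An application-specific permission check would go here, before the lookup.
    row = conn.execute(
        "SELECT stored_name FROM file_ref WHERE id = ?", (file_id,)
    ).fetchone()
    if row is None:
        raise KeyError(file_id)
    # The client only ever supplies an id; the on-disk path is built
    # server-side from the stored name and is never exposed.
    return _iter_chunks(storage_dir / row[0], chunk_size)
```

A web handler would write each chunk to the response body in turn, so the storage directory itself can live outside the publicly served tree.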

J

Jay Stevens answered Nov 03 '22 00:11