
Best way to store binary or image files

What is the best way to store binary or image files?

  1. Database System
  2. File System

Would you please explain why?

A.N.M. Saiful Islam asked Jan 08 '10 15:01



2 Answers

There is no single best way, just a set of trade-offs.

Database Pros:
1. Much easier to deal with in a clustering environment.
2. No reliance on additional resources like a file server.
3. No need to set up "sync" operations in load balanced environment.
4. Backups automatically include the files.

Database Cons:
1. Size / Growth of the database.
2. Depending on the DB server and your language, it might be difficult to insert and retrieve the binary data.
3. Speed / Performance.
4. Depending on DB server, you have to virus scan the files at the time of upload and export.


File Pros:
1. For single web/single db server installations, it's fast.
2. Well understood ability to manipulate files. In other words, it's easy to move the files to a different location if you run out of disk space.
3. Can virus scan when the files are "at rest". This allows you to take advantage of scanner updates.

File Cons:
1. In multi web server environments, requires an accessible share, which should also be clustered for failover.
2. Additional security requirements to handle file access. You have to be careful that the web server and/or share does not allow file execution.
3. Transactional Backups have to take the file system into account.
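
To illustrate the file execution concern in point 2, here is a minimal Python sketch of writing an upload under a server-generated name with no execute permission bits. The names `store_upload` and `is_executable` are made up for this example, and it assumes POSIX-style permission bits. It does not replace configuring the web server or share so that the upload directory is never treated as executable content.

```python
import os
import secrets
import stat
from pathlib import Path


def store_upload(upload_dir, data):
    """Write data under a random name with read/write-only permission bits."""
    # The name is generated by the server, so a user-supplied name or
    # extension (for example ".php") never reaches the file system.
    name = secrets.token_hex(16)
    path = Path(upload_dir) / name
    # O_EXCL makes the call fail instead of overwriting an existing file.
    # Mode 0o640 requests no execute bits (the umask can only remove bits).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o640)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


def is_executable(path):
    """Return True if any execute bit is set on the file."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))
```

The original file name, if you need it, would go in the database next to the generated name rather than on disk.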


That said, SQL Server 2008 has a feature called FILESTREAM which combines both worlds. You upload to the database and it transparently stores the files in a directory on disk. When retrieving, you can either pull from the database or go directly to where the file lives on the file system.

NotMe answered Sep 21 '22 23:09


Pros of Storing binary files in a DB:

  • Some decrease in complexity since the data access layer of your system need only interface to a DB and not a DB + file system.
  • You can secure your files using the same comprehensive permissions-based security that protects the rest of the database.
  • Your binary files are protected against loss along with the rest of your data by way of database backups. No separate filesystem backup system required.

Cons of Storing binary files in a DB:

  • Depending on the size and number of files, they can take up significant space. This can decrease performance (depending on whether your binary files are stored in a table that is also queried often for other content) and makes for longer backup times.
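
One way to soften the performance point above is to keep the binary column in its own table, so that frequent queries for other content never touch it. Below is a rough sketch using Python's built-in `sqlite3` module. The table names (`documents`, `document_blobs`) and helper functions are invented for illustration, and other database servers will have their own types and tuning for this.

```python
import sqlite3


def create_schema(conn):
    # Metadata and binary content live in separate tables, so queries
    # against "documents" do not read the BLOB column.
    conn.executescript("""
        CREATE TABLE documents (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL
        );
        CREATE TABLE document_blobs (
            document_id INTEGER PRIMARY KEY REFERENCES documents(id),
            content BLOB NOT NULL
        );
    """)


def add_document(conn, title, content):
    # "with conn" wraps both inserts in one transaction.
    with conn:
        cur = conn.execute("INSERT INTO documents (title) VALUES (?)", (title,))
        doc_id = cur.lastrowid
        conn.execute(
            "INSERT INTO document_blobs (document_id, content) VALUES (?, ?)",
            (doc_id, content),
        )
    return doc_id


def list_titles(conn):
    # The common query path: metadata only.
    return [row[0] for row in conn.execute("SELECT title FROM documents ORDER BY id")]


def read_content(conn, doc_id):
    # The binary data is fetched only when explicitly requested.
    row = conn.execute(
        "SELECT content FROM document_blobs WHERE document_id = ?", (doc_id,)
    ).fetchone()
    return None if row is None else bytes(row[0])
```

This does nothing for the database size or backup time concerns; it only keeps the large column out of the frequently scanned table.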

Pros of Storing binary files in file system:

  • This is what file systems are good at. File systems handle defragmenting well, and retrieving files (say, to stream a video file through a web server) will likely be faster than with a db.

Cons of Storing binary files in file system:

  • Slightly more complex data access layer. Needs its own backup system. Need to consider referential integrity issues (e.g. deleted pointer in database will need to result in deletion of file so as to not have 'orphaned' files in the filesystem).
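
The 'orphaned' files issue can be handled with a periodic sweep that compares the storage directory against the pointers in the database. A small Python sketch of the idea follows. It assumes a hypothetical table `file_pointers` with a `file_name` column and a flat storage directory, neither of which comes from any particular system.

```python
import sqlite3
from pathlib import Path


def find_orphans(conn, storage_dir):
    """Return names of files on disk that no row in file_pointers references."""
    referenced = {row[0] for row in conn.execute("SELECT file_name FROM file_pointers")}
    on_disk = {p.name for p in Path(storage_dir).iterdir() if p.is_file()}
    return sorted(on_disk - referenced)


def delete_orphans(conn, storage_dir):
    """Delete orphaned files and return the names that were removed."""
    removed = find_orphans(conn, storage_dir)
    for name in removed:
        (Path(storage_dir) / name).unlink()
    return removed
```

If files are written to disk before their row is committed, a sweep like this could remove a file that is mid-upload, so in practice you would probably skip files newer than some grace period.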

On balance I would use the file system. In the past, with SQL Server 2005, I would simply store a 'pointer' to the binary file in the db tables. The pointer would typically be a GUID.
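
As a rough illustration of the pointer approach, here is a Python sketch that writes the file to disk under a GUID and stores only that GUID (plus the original name) in the database. It uses `sqlite3` as a stand-in for the real database server, and the `attachments` table and function names are assumptions made up for this example.

```python
import sqlite3
import uuid
from pathlib import Path


def init_db(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS attachments ("
        "guid TEXT PRIMARY KEY, original_name TEXT NOT NULL)"
    )


def save_file(conn, storage_dir, original_name, data):
    """Write data to disk under a new GUID and record the pointer in the db."""
    guid = str(uuid.uuid4())
    path = Path(storage_dir) / guid
    path.write_bytes(data)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO attachments (guid, original_name) VALUES (?, ?)",
                (guid, original_name),
            )
    except Exception:
        path.unlink()  # do not leave an orphaned file if the insert fails
        raise
    return guid


def load_file(conn, storage_dir, guid):
    """Return (original_name, data) for a stored GUID."""
    row = conn.execute(
        "SELECT original_name FROM attachments WHERE guid = ?", (guid,)
    ).fetchone()
    if row is None:
        raise KeyError(guid)
    return row[0], (Path(storage_dir) / guid).read_bytes()


def delete_file(conn, storage_dir, guid):
    """Remove the pointer first, then the file it pointed to."""
    with conn:
        conn.execute("DELETE FROM attachments WHERE guid = ?", (guid,))
    (Path(storage_dir) / guid).unlink(missing_ok=True)
```

Deleting the row before the file means a crash in between leaves an unreferenced file rather than a dangling pointer, which is usually the easier failure to clean up later.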

Here's the good news if you are using SQL Server 2008 (and maybe others, I don't know): there is built-in support for a hybrid solution with the new FILESTREAM attribute on VARBINARY(MAX) columns. These behave logically like ordinary VARBINARY(MAX) columns, but behind the scenes SQL Server 2008 stores the data in the file system.

Emmanuel answered Sep 22 '22 23:09