
Storing Documents as Blobs in a Database - Any disadvantages?



As your DB grows bigger and bigger, it will become harder to back up. Restoring a backup of a table with over 100 GB of data is not something that makes you happy.

Another thing is that all the table management functions get slower and slower as the dataset grows.
But this can be overcome by making your data table contain just two fields: ID and BLOB.
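A minimal sketch of that layout, using SQLite from Python purely for illustration. The table names, the separate metadata table, and the helper functions are assumptions, not something from the original setup; the point is only that the BLOB table holds nothing but an ID and the payload, so browsing and searching never touch it.

```python
import sqlite3

# In-memory database for illustration only; names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document_meta (
        id         INTEGER PRIMARY KEY,
        filename   TEXT    NOT NULL,
        size_bytes INTEGER NOT NULL
    );
    -- The data table contains just two fields: ID and BLOB.
    CREATE TABLE document_blob (
        id      INTEGER PRIMARY KEY REFERENCES document_meta(id),
        content BLOB NOT NULL
    );
""")


def add_document(conn, filename, content):
    """Insert metadata and payload in one transaction; return the new id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO document_meta (filename, size_bytes) VALUES (?, ?)",
            (filename, len(content)),
        )
        doc_id = cur.lastrowid
        conn.execute(
            "INSERT INTO document_blob (id, content) VALUES (?, ?)",
            (doc_id, content),
        )
    return doc_id


def list_documents(conn):
    """List documents without reading any BLOB data."""
    return conn.execute(
        "SELECT id, filename, size_bytes FROM document_meta ORDER BY id"
    ).fetchall()


def get_content(conn, doc_id):
    """Fetch one payload by primary key, or None if the id is unknown."""
    row = conn.execute(
        "SELECT content FROM document_blob WHERE id = ?", (doc_id,)
    ).fetchone()
    return None if row is None else row[0]
```

With this split, maintenance on the metadata table stays cheap, and the only query that ever hits the large table is a primary-key lookup.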

Retrieving data (by primary key) will likely only become a problem long after you hit a wall with backing up the dataset.


The main disadvantage that I often hear of using blobs is that, above a certain size, the file system is much more efficient at storing and retrieving large files. It sounds like you've already taken this into account in your list of requirements.

There's a good reference (PDF) here that covers the pros and cons of blobs.


From my experience, some issues were:

  1. speed vs having files on the file system.

  2. caching. IMO the web server will do a better job of caching static content. The DB will do a good job too, but if the DB is also handling all sorts of other queries, don't expect those large documents to stay cached for long. You essentially have to transfer the files twice: once from the DB to the web server, and then from the web server to the client.

  3. Memory constraints. At my last job we had an 80MB PDF in the database, and kept getting Java OutOfMemoryErrors in the log file. We eventually realized that the entire 80MB PDF was read into the heap not just once, but TWICE thanks to a setting in Hibernate ORM (if an object is mutable, it makes a copy for editing in memory). Once the PDF was streamed back to the user, the heap was cleaned up, but it was a big hit to take that much out of the heap at once just to stream a document. Know your code and how memory is being used!
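One way to avoid the memory hit in point 3 is to read the document in fixed-size chunks rather than materializing it as a single object. The sketch below is not the Hibernate setup described above; it uses SQLite from Python as a stand-in, and the `documents` table, `stream_blob` helper, and chunk size are all hypothetical. It relies on SQLite's `substr()` working on BLOBs with byte offsets, so the application only ever holds one chunk at a time.

```python
import io
import sqlite3

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size; tune for your workload


def stream_blob(conn, doc_id, out, chunk_size=CHUNK_SIZE):
    """Copy one stored document to `out` chunk by chunk; return bytes written."""
    offset = 1  # SQLite's substr() is 1-indexed
    total = 0
    while True:
        row = conn.execute(
            "SELECT substr(content, ?, ?) FROM documents WHERE id = ?",
            (offset, chunk_size, doc_id),
        ).fetchone()
        # No such row, NULL content, or an empty slice past the end: stop.
        if row is None or not row[0]:
            break
        chunk = row[0]
        out.write(chunk)
        total += len(chunk)
        offset += len(chunk)
    return total


# Demo with an in-memory database and a 256,000-byte payload.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, content BLOB)")
payload = bytes(range(256)) * 1000
conn.execute("INSERT INTO documents (id, content) VALUES (1, ?)", (payload,))

sink = io.BytesIO()  # stands in for the HTTP response stream
written = stream_blob(conn, 1, sink)
```

In a real web application `out` would be the response stream, so each chunk can be garbage collected as soon as it has been sent. Other databases and drivers expose their own streaming interfaces for large objects, which are usually a better fit than repeated `substr()` queries.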

Your web server should be able to handle most of your security concerns, but if documents are small and the DB isn't already under a big load, then I don't really see a big issue with having them in the DB.


I've just started researching SQL Server 2008's FILESTREAMing for BLOBs and have run across a HUGE limitation (IMO)--streaming access only works with integrated security. If you don't use Windows Authentication to connect to the DB server, you're unable to read/write the BLOBs through the file-streaming API. Many application environments can't use Windows Authentication. Certainly not in heterogeneous environments.

A better solution for storing BLOBs must exist. What are the best practices?


This article covers most of the issues. If you are using SQL Server 2008, check out the use of the new FILESTREAM type as discussed by Paul Randal here.


It depends on the database type. Oracle or SQL Server? Be aware of one disadvantage: restoring a single document.