Using SQL Server as Image store

Is SQL Server 2008 a good option to use as an image store for an e-commerce website? It would be used to store product images of various sizes and angles. A web server would output those images, reading the table by a clustered ID. The total image size would be around 10 GB, but it will need to scale. I see a lot of benefits over using the file system, but I am worried that SQL Server, not having an O(1) lookup, is not the best solution, given that the site has a lot of traffic. Would that even be a bottleneck? What are some thoughts, or perhaps other options?

asked Dec 02 '08 20:12 by eulerfx


3 Answers

10 GB is not a huge amount of data, so you can probably store it in the database without major issues. That said, the file system is the better choice for performance, while the database is the better choice for safety and management (backups and consistency).

Happily, SQL Server 2008 allows you to have your cake and eat it too, with:

The FILESTREAM Attribute

In SQL Server 2008, you can apply the FILESTREAM attribute to a varbinary column, and SQL Server then stores the data for that column on the local NTFS file system. Storing the data on the file system brings two key benefits:

  • Performance matches the streaming performance of the file system.
  • BLOB size is limited only by the file system volume size.

However, the column can be managed just like any other BLOB column in SQL Server, so administrators can use the manageability and security capabilities of SQL Server to integrate BLOB data management with the rest of the data in the relational database—without needing to manage the file system data separately.

Defining the data as a FILESTREAM column in SQL Server also ensures data-level consistency between the relational data in the database and the unstructured data that is physically stored on the file system. A FILESTREAM column behaves exactly the same as a BLOB column, which means full integration of maintenance operations such as backup and restore, complete integration with the SQL Server security model, and full-transaction support.

Application developers can work with FILESTREAM data through one of two programming models; they can use Transact-SQL to access and manipulate the data just like standard BLOB columns, or they can use the Win32 streaming APIs with Transact-SQL transactional semantics to ensure consistency, which means that they can use standard Win32 read/write calls to FILESTREAM BLOBs as they would if interacting with files on the file system.

In SQL Server 2008, FILESTREAM columns can only store data on local disk volumes, and some features such as transparent encryption and table-valued parameters are not supported for FILESTREAM columns. Additionally, you cannot use tables that contain FILESTREAM columns in database snapshots or database mirroring sessions, although log shipping is supported.

answered Oct 31 '22 09:10 by Vinko Vrsalovic


Check out this white paper from MS Research (http://research.microsoft.com/research/pubs/view.aspx?msr_tr_id=MSR-TR-2006-45)

They detail exactly what you're looking for. The short version is that once files grow beyond about 1 MB, storing them in the database starts to perform worse than saving them on the file system.
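
As a rough illustration of that rule of thumb, here is a minimal Python sketch that picks a storage location by object size. The function name `suggested_store` and the exact 1 MB cutoff are assumptions taken from the summary above, not from the paper itself, and a real decision would also weigh access patterns and fragmentation.

```python
ONE_MB = 1024 * 1024  # cutoff assumed from the summary above


def suggested_store(size_bytes: int, threshold: int = ONE_MB) -> str:
    """Return 'filesystem' for objects above the threshold, else 'database'.

    Hypothetical helper: it encodes only the size rule of thumb and
    ignores every other factor (backups, consistency, caching).
    """
    return "filesystem" if size_bytes > threshold else "database"


# A typical 50 KB product image falls well below the cutoff.
print(suggested_store(50 * 1024))
# A 5 MB high-resolution original falls above it.
print(suggested_store(5 * ONE_MB))
```

By this measure, product images averaging tens of kilobytes sit comfortably on the database side of the cutoff.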

answered Oct 31 '22 09:10 by Joel.Cogley


I doubt that O(log n) for lookups would be a problem. You say you have 10 GB of images. Assuming an average image size of, say, 50 KB, that's 200,000 images. Doing an indexed lookup in a table of 200K rows is not a problem. It would be small compared to the time needed to actually read the image from disk and transfer it through your app and to the client.
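
The arithmetic above can be checked with a short Python sketch. The fanout of 100 keys per index page is a made-up, deliberately conservative figure for illustration, and `btree_levels` is a hypothetical helper, not a description of how SQL Server sizes its indexes.

```python
def image_count(total_bytes: int, avg_image_bytes: int) -> int:
    """Number of images that fit in the total at the given average size."""
    return total_bytes // avg_image_bytes


def btree_levels(rows: int, fanout: int) -> int:
    """Levels a B-tree needs to index `rows` keys at the given fanout."""
    levels = 1
    capacity = fanout
    while capacity < rows:
        capacity *= fanout
        levels += 1
    return levels


TOTAL_BYTES = 10 * 1000**3   # 10 GB, decimal units as in the estimate above
AVG_IMAGE_BYTES = 50 * 1000  # 50 KB average, the figure assumed above
ASSUMED_FANOUT = 100         # hypothetical keys per index page

rows = image_count(TOTAL_BYTES, AVG_IMAGE_BYTES)
print(rows)                                # 200000
print(btree_levels(rows, ASSUMED_FANOUT))  # 3
```

Under these assumptions a lookup touches only a handful of index pages, and growing the table a hundredfold adds roughly one more level.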

It's still worth considering the usual pros and cons of storing images in a database versus storing paths in the database to files on the filesystem. For example:

  • Images in the database obey transaction isolation, automatically delete when the row is deleted, etc.
  • A database with 10 GB of images is of course larger than a database storing only pathnames to image files. Backup speed and other factors are relevant.
  • You need to set MIME headers on the response when you serve an image from a database, through an application.
  • The images on a filesystem are more easily cached by the web server (e.g. Apache mod_mmap), or could be served by a leaner web server like lighttpd. This is actually a pretty big benefit.
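
To make the MIME header point concrete, here is a minimal Python sketch of an application building the response for an image read from a database. It uses the standard library's `sqlite3` as a stand-in for SQL Server so that it runs anywhere; the `product_image` table, its columns, and the `image_response` function are all hypothetical names chosen for illustration.

```python
import mimetypes
import sqlite3


def open_demo_store() -> sqlite3.Connection:
    """Create an in-memory table standing in for the real image table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product_image ("
        "id INTEGER PRIMARY KEY, filename TEXT NOT NULL, data BLOB NOT NULL)"
    )
    return conn


def image_response(conn: sqlite3.Connection, image_id: int):
    """Return (status, headers, body) for one image looked up by primary key."""
    row = conn.execute(
        "SELECT filename, data FROM product_image WHERE id = ?", (image_id,)
    ).fetchone()
    if row is None:
        return 404, {"Content-Type": "text/plain"}, b"not found"
    filename, data = row
    # The application, not the web server, has to work out the MIME type.
    content_type, _ = mimetypes.guess_type(filename)
    headers = {
        "Content-Type": content_type or "application/octet-stream",
        "Content-Length": str(len(data)),
    }
    return 200, headers, bytes(data)


conn = open_demo_store()
conn.execute(
    "INSERT INTO product_image (id, filename, data) VALUES (?, ?, ?)",
    (1, "shoe-front.png", b"\x89PNG"),
)
status, headers, body = image_response(conn, 1)
print(status, headers["Content-Type"], headers["Content-Length"])
```

A web server serving files from disk does this header work, plus caching, on its own, which is part of why the last bullet counts as a real benefit.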

answered Oct 31 '22 08:10 by Bill Karwin