Is it a good idea to store large amounts of text (e.g. HTML pages) inside your SQL database? Or is it better to store it as HTML files in the filesystem?
The same goes for images: is it a good idea to store image data in the database, or is it better to put the images on disk?
For example, will storing large amounts of data cause performance problems? What are the pros and cons of each method of storage?
In terms of the size of the data, in this case I am looking at "a few pages" of HTML and images of less than about 500 KB (probably a lot smaller, though). That is enough to produce a web page on the scale of your average article or blog entry.
If you want to store large amounts of text in a SQL Server database, use either a varchar(max) or an nvarchar(max) column. In case you don't know the difference, nvarchar stores Unicode characters, while varchar stores non-Unicode text in the code page of the database.
The IMAGE data type in SQL Server has traditionally been used to store image files. Since SQL Server 2005, Microsoft has recommended VARBINARY(MAX) instead of IMAGE for storing a large amount of binary data in a single column, because IMAGE is deprecated and will be removed in a future version of SQL Server.
IMAGE is a variable-length data type that stores binary data. It can hold up to 2 GB (2,147,483,647 bytes) of data.
TEXT is a variable-length character string data type that stores non-Unicode data in the code page of the SQL database server, with a maximum string length of 2,147,483,647 characters.
Storing binary data (documents, images etc) in the database has some advantages.
You can commit the update of the document itself in the same transaction as the information (name, date etc) you want to store about the document. This means you don't have to worry about writing your own two-phase commit (although I seem to recall that SQL Server 2008 has a solution for this, in the form of its FILESTREAM feature).
You can back up the whole lot (documents and metadata) at once, without worrying about having to synchronise the database with the file system.
You can deliver documents very simply over .NET web services, since they come straight out into DataTables, and are serialised effortlessly just by putting the DataTables into a DataSet and passing it.
You can apply database security to the objects, as to the rest of your data, and not have to worry about network file permissions.
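To make the single-transaction point above concrete, here is a minimal Python sketch. It uses the standard-library sqlite3 module purely as a stand-in for SQL Server, so that it runs anywhere; the table names (documents, document_bodies) and the helpers open_store and save_document are hypothetical and not taken from the answer. The idea is only that the metadata row and the binary body are either both stored or both rolled back.

```python
import sqlite3


def open_store():
    """Create an in-memory database with hypothetical metadata and body tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " created TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE document_bodies ("
        " document_id INTEGER PRIMARY KEY REFERENCES documents(id),"
        " body BLOB NOT NULL)"
    )
    return conn


def save_document(conn, name, created, body):
    """Store metadata and binary content in one transaction."""
    # 'with conn' commits if the block succeeds and rolls back if it raises,
    # so a failed body insert also undoes the metadata insert.
    with conn:
        cur = conn.execute(
            "INSERT INTO documents (name, created) VALUES (?, ?)",
            (name, created),
        )
        conn.execute(
            "INSERT INTO document_bodies (document_id, body) VALUES (?, ?)",
            (cur.lastrowid, body),
        )
        return cur.lastrowid


conn = open_store()
doc_id = save_document(conn, "article.html", "2024-01-01", b"<html><body>Hello</body></html>")
```

If the second insert fails (for example because body is None and violates the NOT NULL constraint), the metadata row disappears as well, which is the behaviour you would otherwise have to build by hand when the file lives outside the database.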
It does have some disadvantages too:
Backups can get very large.
The size of the binary object in the database can be quite a bit larger than the file it originally came from, and therefore, in a client-server environment, it can increase the time taken to open the document across the network.
Depending on the application, you might need to consider the load on the database server if it has to serve up a lot of large documents.
All that said, it's a technique I use extensively, and it works very well.
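For comparison, the file-system alternative the question asks about might look roughly like the sketch below. Again this is Python with sqlite3 standing in for the real database, and the files table, open_index, save_to_disk and the random file-naming scheme are all assumptions made for illustration. It shows the extra bookkeeping involved: the file write is not covered by the database transaction, so the code has to clean up after itself when the insert fails.

```python
import sqlite3
import tempfile
import uuid
from pathlib import Path


def open_index():
    """Create an in-memory table that records where each file lives on disk."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " path TEXT NOT NULL,"
        " size INTEGER NOT NULL)"
    )
    return conn


def save_to_disk(conn, root, name, body):
    """Write the bytes to a file under root, then record its path in the database."""
    path = Path(root) / uuid.uuid4().hex  # random name avoids collisions
    path.write_bytes(body)
    try:
        with conn:
            conn.execute(
                "INSERT INTO files (name, path, size) VALUES (?, ?, ?)",
                (name, str(path), len(body)),
            )
    except sqlite3.Error:
        # The database rolled back, but the file is outside the transaction,
        # so it has to be removed by hand to keep the two in sync.
        path.unlink()
        raise
    return path


root = tempfile.mkdtemp()
conn = open_index()
saved = save_to_disk(conn, root, "photo.png", b"\x89PNG placeholder bytes")
```

Even with the cleanup, a crash between the file write and the insert would leave an orphaned file behind, which is the kind of synchronisation problem that storing the data in the database avoids.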