
SQL Server FILESTREAM limitation

I am looking at the FILESTREAM attribute in SQL Server for storing files. As I understand it, SQL Server stores the files on disk, keeps the file pointer/path information in the database, and maintains transactional consistency in the process.

There also seems to be a limitation on the FILESTREAM attribute: "FILESTREAM data can be stored only on local disk volumes".

If I anticipate my web app storing 200,000 images of 1-2 MB each, I would need roughly 200-400 GB of disk space for the images. Since FILESTREAM requires all data to be stored on local disk, as per the limitation, it seems it would be impractical to store millions of files on a single hard drive, as the storage requirements would be extremely large.
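As a rough check of that estimate, here is a minimal sketch of the arithmetic. The function name `estimate_storage_gb` is hypothetical, and the calculation assumes decimal units (1 GB = 1000 MB) with no allowance for file system, index, or backup overhead.

```python
# Back-of-the-envelope storage estimate for the image store described above.
# Assumptions (illustrative only): decimal units (1 GB = 1000 MB) and no
# allowance for file system, index, or backup overhead.

MB_PER_GB = 1000


def estimate_storage_gb(file_count, min_mb, max_mb):
    """Return the (low, high) total size in GB for file_count files."""
    low = file_count * min_mb / MB_PER_GB
    high = file_count * max_mb / MB_PER_GB
    return low, high


low_gb, high_gb = estimate_storage_gb(200_000, 1, 2)
print(f"{low_gb:.0f}-{high_gb:.0f} GB")  # prints "200-400 GB"
```

With these numbers, 200,000 files land between 200 GB and 400 GB, so the upper end of the size range doubles the requirement.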

Is my understanding of the limitation correct, or am I missing something here?

If this limitation is real, I would instead store the images in the database as plain BLOBs and cluster the database as storage requirements grow, which doesn't seem to be possible with FILESTREAM.

Please share your thoughts!

UPDATED:
A few more questions regarding FILESTREAM:

  1. How to handle data recovery in case of data container corruption?
  2. Can we just back up the DB without the file system data? [assuming the data is on a SAN and need not be moved]
  3. I would like to back up or restore the DB and just remap the filegroup path information [that maps to SAN]. Is this possible?
asked Sep 14 '09 03:09 by pencilslate




2 Answers

FILESTREAM does not actually require physically local storage; it just cannot use SMB network storage. An iSCSI or Fibre Channel SAN works fine for storing FILESTREAM data. You can also have multiple FILESTREAM filegroups per table, essentially partitioning your data. If you are strictly targeting SQL Server 2008, there is very little reason not to use FILESTREAM for large binary data. There is a Microsoft whitepaper describing FILESTREAM partitioning here.

answered Sep 20 '22 22:09 by Jeff Mc


On the local disk volume requirement

Do not take "local" too literally. While SQL Server must indeed "see" the filegroup(s) associated with FILESTREAM data as local drives, this storage is often supplied by a SAN or other storage technology that presents itself to Windows as local NTFS disks (by way of iSCSI and such). This is particularly true of enterprise applications with the level of space requirement you mention.

On using FILESTREAM at all...

Do weigh the pros and cons carefully. Your question mentions rather big (MB-size) images (I'm assuming graphic images, not logical images of some sort), which implies that each one is used atomically, read or written as a whole. A file server setup would require management and synchronization external to SQL Server, but this seems a relatively small price to pay to keep your freedom, not only vis-a-vis SQL Server / Microsoft, but also in your ability to move things around more easily for scaling / bandwidth purposes.

answered Sep 19 '22 22:09 by mjv