I have to store file paths in a DB field (/tmp/aaa/bbb
, C:\temp\xxx\yyy
, etc.). I can't really tell how long they could be.
Given this http://en.wikipedia.org/wiki/Comparison_of_file_systems and that http://msdn.microsoft.com/en-us/library/aa365247.aspx, depending on the file system there could be theoretically no length limit for a path.
I guess that defining this field as a LONGBLOB
or VARCHAR(very high value)
wouldn't be wise. I've thought about something like VARCHAR(1024)
which should be suitable for most frequent (even if not all) cases, and not too big as a DB field. What would you recommend ?
Thanks.
Use the appropriate length for the data you intend to support. Since you're using SQL Server you should use nvarchar(260)
as the upper limit for storing path names, since that is the specification limit for typical Windows machines. Under certain circumstances you can create paths longer than that, however Windows Explorer will tend to have problems handling them. SQL Server cannot handle filenames longer than 260 characters. This includes SQL Server on Linux.
I can prove SQL Server uses an nvarchar(260)
column internally to store SQL Server Database filenames, with the path included. Checking the definition of the sys.master_files
view, we see the following T-SQL:
CREATE VIEW sys.master_files AS
SELECT
database_id = f.dbid,
file_id = f.fileid,
file_guid = f.fileguid,
type = f.filetype,
type_desc = ft.name,
data_space_id = f.grpid,
name = f.lname,
physical_name = f.pname,
state = convert(tinyint, case f.filestate -- Map enum EMDFileState to AvailablityStates
when 0 then 0 when 10 then 0 -- ONLINE
when 4 then 7 -- DEFUNCT
when 5 then 3 when 9 then 3 -- RECOVERY_PENDING
when 7 then 1 when 8 then 1 when 11 then 1 -- RESTORING
when 12 then 4 -- SUSPECT
else 6 end), -- OFFLINE
state_desc = st.name,
f.size,
max_size = f.maxsize,
f.growth,
is_media_read_only = sysconv(bit, f.status & 8), -- FIL_READONLY_MEDIA
is_read_only = sysconv(bit, f.status & 16), -- FIL_READONLY
is_sparse = sysconv(bit, f.status & 256), -- FIL_SPARSE_FILE
is_percent_growth = sysconv(bit, f.status & 32), -- FIL_PERCENT_GROWTH
is_name_reserved = sysconv(bit, case f.filestate when 3 then 1 else 0 end), -- x_efs_DroppedReusePending
create_lsn = GetNumericLsn(f.createlsn),
drop_lsn = GetNumericLsn(f.droplsn),
read_only_lsn = GetNumericLsn(f.readonlylsn),
read_write_lsn = GetNumericLsn(f.readwritelsn),
differential_base_lsn = GetNumericLsn(f.diffbaselsn),
differential_base_guid = f.diffbaseguid,
differential_base_time = nullif(f.diffbasetime, 0),
redo_start_lsn = GetNumericLsn(f.redostartlsn),
redo_start_fork_guid = f.redostartforkguid,
redo_target_lsn = GetNumericLsn(f.redotargetlsn),
redo_target_fork_guid = f.forkguid,
backup_lsn = GetNumericLsn(f.backuplsn),
credential_id = cr.credential_id
FROM sys.sysbrickfiles f
LEFT JOIN sys.syspalvalues st ON st.class = 'DBFS' AND st.value = f.filestate
LEFT JOIN sys.syspalvalues ft ON ft.class = 'DBFT' AND ft.value = f.filetype
LEFT JOIN sys.credentials cr ON f.pname LIKE cr.name + N'%' COLLATE database_default
WHERE f.dbid < 0x7fff -- consistent with sys.databases
AND f.pruid = 0
AND f.filestate NOT IN (1, 2) -- x_efs_Dummy, x_efs_Dropped
AND has_access('MF', 1) = 1
Microsoft Docs for sys.master_files says this about the physical_name
column:
physical_name nvarchar(260) Operating-system file name.
But let's not trust that. We see the physical file name is referenced as physical_name = f.pname
. And the table alias "f" points to FROM sys.sysbrickfiles f
. Therefore, SQL Server stores the filename in sys.sysbrickfiles, which is an internal table that is only visible from the Dedicated Administrator Connection, or DAC as its frequently known. Connecting to the DAC, and generating a temp table from the output of sys.sysbrickfiles
, we see the following:
CREATE TABLE #sysbrickfiles
(
brickid int NOT NULL
, dbid int NOT NULL
, pruid int NOT NULL
, fileid int NOT NULL
, grpid int NOT NULL
, status int NOT NULL
, filetype tinyint NOT NULL
, filestate tinyint NOT NULL
, size int NOT NULL
, maxsize int NOT NULL
, growth int NOT NULL
, lname nvarchar(128) NOT NULL
, pname nvarchar(260) NOT NULL
, createlsn binary(10) NULL
, droplsn binary(10) NULL
, fileguid uniqueidentifier NULL
, internalstatus int NOT NULL
, readonlylsn binary(10) NULL
, readwritelsn binary(10) NULL
, readonlybaselsn binary(10) NULL
, firstupdatelsn binary(10) NULL
, lastupdatelsn binary(10) NULL
, backuplsn binary(10) NULL
, diffbaselsn binary(10) NULL
, diffbaseguid uniqueidentifier NULL
, diffbasetime datetime NOT NULL
, diffbaseseclsn binary(10) NULL
, redostartlsn binary(10) NULL
, redotargetlsn binary(10) NULL
, forkguid uniqueidentifier NULL
, forklsn binary(10) NULL
, forkvc bigint NOT NULL
, redostartforkguid uniqueidentifier NULL
);
As you can see, the pname
column is indeed defined as nvarchar(260)
.
Also, if we attempt to create a database using a filename that is longer than 260 characters, we see an error is returned:
Msg 103, Level 15, State 3, Line 7
The file that starts with 'F:\AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAARGH.mdf' is too long. Maximum length is 259.
Using anything other than an nvarchar(260)
column to store filenames in SQL Server is both wasteful, and creates technical debt.
The length of a column is extremely important performance-wise. Column length directly affects:
Customer
table where the key is the CustomerName
column, defined as varchar(500)
. Every table that references customers will now require a 500-byte CustomerName
column. If that column was defined as a varchar(100)
instead, every query referencing those columns will save 200 bytes per row in memory and disk I/O.(MAX)
data types, which can severely limit performance.If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With