I have variable length character data and want to store in SQL Server (2005) database. I want to learn some best practices about how to choose TEXT SQL type or choose VARCHAR SQL type, pros and cons in performance/footprint/function.
Some Differences Between VARCHAR and TEXTThe VAR in VARCHAR means that you can set the max size to anything between 1 and 65,535. TEXT fields have a fixed max size of 65,535 characters. A VARCHAR can be part of an index whereas a TEXT field requires you to specify a prefix length, which can be part of an index.
Use CHAR when you know you have a fixed number of characters for every entry. Use VARCHAR when you have a variable number of characters for every entry. If you need more storage than VARCHAR can provide, CLOB with UTF-8 encoding or equivalent standard type. NEVER use TEXT as it is non-standard.
In most circumstances, VARCHAR provides better performance, it's more flexible, and can be fully indexed. If you need to store longer strings, use MEDIUMTEXT or LONGTEXT, but be aware that very large amounts of data can be stored in columns of these types.
TEXT
is used for large pieces of string data. If the length of the field exceeed a certain threshold, the text is stored out of row.
VARCHAR
is always stored in row and has a limit of 8000 characters. If you try to create a VARCHAR(x)
, where x > 8000, you get an error:
Server: Msg 131, Level 15, State 3, Line 1
The size () given to the type ‘varchar’ exceeds the maximum allowed for any data type (8000)
These length limitations do not concern VARCHAR(MAX)
in SQL Server 2005, which may be stored out of row, just like TEXT
.
Note that MAX
is not a kind of constant here, VARCHAR
and VARCHAR(MAX)
are very different types, the latter being very close to TEXT
.
In prior versions of SQL Server you could not access the TEXT
directly, you only could get a TEXTPTR
and use it in READTEXT
and WRITETEXT
functions.
In SQL Server 2005 you can directly access TEXT
columns (though you still need an explicit cast to VARCHAR
to assign a value for them).
TEXT
is good:
VARCHAR
is good:
By selecting here I mean issuing any queries that return the value of the column.
By searching here I mean issuing any queries whose result depends on the value of the TEXT
or VARCHAR
column. This includes using it in any JOIN
or WHERE
condition.
As the TEXT
is stored out of row, the queries not involving the TEXT
column are usually faster.
Some examples of what TEXT
is good for:
Some examples of what VARCHAR
is good for:
As a rule of thumb, if you ever need you text value to exceed 200 characters AND do not use join on this column, use TEXT
.
Otherwise use VARCHAR
.
P.S. The same applies to UNICODE
enabled NTEXT
and NVARCHAR
as well, which you should use for examples above.
P.P.S. The same applies to VARCHAR(MAX)
and NVARCHAR(MAX)
that SQL Server 2005+ uses instead of TEXT
and NTEXT
. You'll need to enable large value types out of row
for them with sp_tableoption
if you want them to be always stored out of row.
As mentioned above and here, TEXT
is going to be deprecated in future releases:
The
text in row
option will be removed in a future version of SQL Server. Avoid using this option in new development work, and plan to modify applications that currently usetext in row
. We recommend that you store large data by using thevarchar(max)
,nvarchar(max)
, orvarbinary(max)
data types. To control in-row and out-of-row behavior of these data types, use thelarge value types out of row
option.
If you're using SQL Server 2005 or later, use varchar(MAX)
. The text
datatype is deprecated and should not be used for new development work. From the docs:
Important
ntext
,text
, andimage
data types will be removed in a future version of Microsoft SQL Server. Avoid using these data types in new development work, and plan to modify applications that currently use them. Use nvarchar(max), varchar(max), and varbinary(max) instead.
In SQL server 2005 new datatypes were introduced: varchar(max)
and nvarchar(max)
They have the advantages of the old text type: they can contain op to 2GB of data, but they also have most of the advantages of varchar
and nvarchar
. Among these advantages are the ability to use string manipulation functions such as substring().
Also, varchar(max) is stored in the table's (disk/memory) space while the size is below 8Kb. Only when you place more data in the field, it's is stored out of the table's space. Data stored in the table's space is (usually) retrieved quicker.
In short, never use Text, as there is a better alternative: (n)varchar(max). And only use varchar(max) when a regular varchar is not big enough, ie if you expect the string that you're going to store will exceed 8000 characters.
As was noted, you can use SUBSTRING on the TEXT datatype,but only as long the TEXT fields contains less than 8000 characters.
There has been some major changes in ms 2008 -> Might be worth considering the following article when making a decisions on what data type to use. http://msdn.microsoft.com/en-us/library/ms143432.aspx
Bytes per
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With