
Using varchar(MAX) vs TEXT on SQL Server

I just read that the VARCHAR(MAX) datatype (which can store close to 2GB of char data) is the recommended replacement for the TEXT datatype in SQL Server 2005 and later versions.

If I want to search inside a column for any string, which operation is quicker?

  1. Using the LIKE clause against a VARCHAR(MAX) column?

    WHERE COL1 LIKE '%search string%'

  2. Using a TEXT column with a Full Text Index/Catalog on it, and then searching with the CONTAINS clause?

    WHERE CONTAINS (Col1, 'MyToken')

user85116 asked May 07 '09 13:05



5 Answers

The VARCHAR(MAX) type is a replacement for TEXT. The basic difference is that, by default, a TEXT column stores its data in a blob regardless of size, whereas a VARCHAR(MAX) column attempts to store the data directly in the row and only moves it to a blob once it exceeds the 8k limit.

Using the LIKE statement is identical between the two datatypes. The additional functionality VARCHAR(MAX) gives you is that it can also be used with = and GROUP BY, like any other VARCHAR column. However, if you have a lot of data you will have a huge performance issue using these methods.

As for whether you should search with LIKE or with Full Text Indexing and CONTAINS, the answer is the same regardless of VARCHAR(MAX) or TEXT.

If you are searching large amounts of text and performance is key then you should use a Full Text Index.

LIKE is simpler to implement and is often suitable for small amounts of data, but it performs extremely poorly on large data: a pattern with a leading wildcard, such as '%search string%', cannot use an index seek, so every row has to be scanned.
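
As a rough sketch of the two approaches side by side (the table `dbo.Documents`, the index `PK_Documents` and the catalog `DocumentsCatalog` are made-up names for illustration, and this assumes the Full-Text Search feature is installed on the instance):

```sql
-- Hypothetical table; all object names here are illustrative only.
CREATE TABLE dbo.Documents (
    DocId INT IDENTITY(1,1) NOT NULL,
    Col1  VARCHAR(MAX) NULL,
    CONSTRAINT PK_Documents PRIMARY KEY CLUSTERED (DocId)
);
GO

-- A full-text index lives in a full-text catalog.
CREATE FULLTEXT CATALOG DocumentsCatalog;
GO

-- The KEY INDEX must be a unique, single-column, non-nullable index.
CREATE FULLTEXT INDEX ON dbo.Documents (Col1)
    KEY INDEX PK_Documents
    ON DocumentsCatalog
    WITH CHANGE_TRACKING AUTO;
GO

-- Word-based search answered from the full-text index.
SELECT DocId
FROM dbo.Documents
WHERE CONTAINS(Col1, 'MyToken');

-- Substring search; the leading wildcard means every row is examined.
SELECT DocId
FROM dbo.Documents
WHERE Col1 LIKE '%search string%';
```

Keep in mind the two queries are not interchangeable: CONTAINS matches whole words (or word prefixes), while LIKE '%...%' matches any substring, so they can return different rows.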

Robin Day answered Oct 05 '22 05:10


For large text, the full text index is much faster. But you can full text index varchar(max) as well.

Joel Coehoorn answered Oct 05 '22 05:10


You can't compare a text field with the = operator without converting it from text to varchar.

DECLARE @table TABLE (a text)
INSERT INTO @table VALUES ('a')
INSERT INTO @table VALUES ('a')
INSERT INTO @table VALUES ('b')
INSERT INTO @table VALUES ('c')
INSERT INTO @table VALUES ('d')


SELECT *
FROM @table
WHERE a = 'a'

This will give you the error:

The data types text and varchar are incompatible in the equal to operator.

Whereas this does not:

DECLARE @table TABLE (a varchar(max))

Interestingly, LIKE still works, i.e.

WHERE a LIKE '%a%'
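
If you are stuck with a text column, one workaround is to convert it in the query. A minimal sketch, reusing the same made-up table variable as above:

```sql
DECLARE @table TABLE (a text);
INSERT INTO @table VALUES ('a');
INSERT INTO @table VALUES ('b');

-- Casting the text column to varchar(max) makes the = comparison legal.
SELECT CAST(a AS varchar(max)) AS a
FROM @table
WHERE CAST(a AS varchar(max)) = 'a';
```

The cast is applied to every row, so this is only a stopgap; changing the column itself to varchar(max) avoids it.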

DForck42 answered Oct 05 '22 04:10


  • Basic Definition

TEXT and VarChar(MAX) are non-Unicode, large, variable-length character data types that can store a maximum of 2,147,483,647 non-Unicode characters (i.e. the maximum storage capacity is 2GB).

  • Which one to Use?

According to MSDN, Microsoft recommends avoiding the TEXT datatype, which will be removed in a future version of SQL Server. VarChar(MAX) is the suggested data type for storing large string values instead of TEXT.

  • In-Row or Out-of-Row Storage

Data in a TEXT column is stored out-of-row in separate LOB data pages. The row in the table data page holds only a 16-byte pointer to the LOB data page where the actual data resides. Data in a VarChar(MAX) column is stored in-row if it is less than or equal to 8000 bytes. If the value is larger than 8000 bytes, it is stored in separate LOB data pages and the row holds only a 16-byte pointer to the LOB data page where the actual data resides. So "in-row" VarChar(MAX) is good for searches and retrieval.
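
To get a feel for where values are likely to end up, you can compare their byte length against the 8000-byte threshold. This is only a sketch: `dbo.StorageDemo` is a hypothetical table, and the label in the last column is an expectation based on the rule above, not something read from the storage engine.

```sql
-- Hypothetical demo table.
CREATE TABLE dbo.StorageDemo (
    Id   INT IDENTITY(1,1) PRIMARY KEY,
    Col1 VARCHAR(MAX)
);

INSERT INTO dbo.StorageDemo (Col1) VALUES ('short value');
-- The CAST is needed: REPLICATE truncates at 8000 bytes unless its input is a MAX type.
INSERT INTO dbo.StorageDemo (Col1) VALUES (REPLICATE(CAST('x' AS VARCHAR(MAX)), 9000));

-- DATALENGTH returns the number of bytes stored in the value.
SELECT Id,
       DATALENGTH(Col1) AS bytes,
       CASE WHEN DATALENGTH(Col1) > 8000
            THEN 'expected out-of-row'
            ELSE 'expected in-row (if the row has space)'
       END AS expected_storage
FROM dbo.StorageDemo;

-- Optional: ask SQL Server to keep VarChar(MAX) values out of row, similar to TEXT.
EXEC sp_tableoption N'dbo.StorageDemo', 'large value types out of row', 1;
```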

  • Supported/Unsupported Functionalities

Some string functions, operators and constructs don't work on a TEXT type column, but they do work on a VarChar(MAX) type column.

  1. = Equal to operator on VarChar(MAX) type column
  2. GROUP BY clause on VarChar(MAX) type column
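
A small illustration of both items, using a made-up table variable; the same statements against a TEXT column would be rejected:

```sql
DECLARE @t TABLE (a varchar(max));
INSERT INTO @t VALUES ('a');
INSERT INTO @t VALUES ('a');
INSERT INTO @t VALUES ('b');

-- Equality comparison works directly on varchar(max).
SELECT a
FROM @t
WHERE a = 'a';

-- So does grouping: this returns 'a' with 2 occurrences and 'b' with 1.
SELECT a, COUNT(*) AS occurrences
FROM @t
GROUP BY a
ORDER BY a;
```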

  • System IO Considerations

VarChar(MAX) column values are stored out-of-row only when the value is longer than 8000 bytes or there is not enough space in the row; otherwise they are stored in-row. So if most of the values stored in the VarChar(MAX) column are large and stored out-of-row, data retrieval will behave almost the same as for a TEXT column.

If most of the values stored in a VarChar(MAX) column are small enough to be stored in-row, then a query that does not include the LOB column has to read more data pages, since the LOB column values are stored in the same data pages as the non-LOB column values. But if the SELECT query includes the LOB column, it has to read fewer pages than it would for a TEXT column.

Conclusion

Use VarChar(MAX) data type rather than TEXT for better performance.

Source

Somnath Muluk answered Oct 05 '22 04:10


If you are using MS Access (especially older versions like 2003), you are forced to use the TEXT datatype on SQL Server, because MS Access does not recognize nvarchar(MAX) as a Memo field, whereas it does recognize TEXT as one.

Klaus Oberdalhoff answered Oct 05 '22 05:10