I am wondering if there is any disadvantage to defining a column as nvarchar(max) instead of giving it a (smaller) maximum size.
I read somewhere that if the column value has more than 4 KB, the remaining data will be added to an "overflow" area, which is OK.
I'm creating a table where most of the time the text will only be a few lines, but I was wondering if there's any advantage in setting a lower limit and then adding validation to avoid breaking that limit.
Is there any restriction on creating indexes on an nvarchar(max) column, or anything else that makes adding the size limit worth it?
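For reference, what I mean by "a lower limit plus validation" is something like the following sketch (the table name and limit are made up):

CREATE TABLE dbo.Notes (
    id   int IDENTITY(1,1) PRIMARY KEY,
    body nvarchar(500) NOT NULL  -- explicit cap instead of nvarchar(max)
);
-- plus validation in the application so inserts never exceed 500 characters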
Thanks!
I read the answer as "no, there is no disadvantage to using N/VARCHAR(MAX)", because there is additional processing "only if the size exceeds 8000".
In addition, varchar(max) prevents online index operations against the entire table that contains the varchar(max) column. This can significantly impact the performance of your system.
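A minimal illustration of the restriction, assuming the dbo.Notes table from the question were declared with body nvarchar(max); note that SQL Server 2012 relaxed this restriction for index rebuilds:

-- Rebuild every index on the table online; on affected versions this
-- fails because the clustered index covers the nvarchar(max) column.
ALTER INDEX ALL ON dbo.Notes REBUILD WITH (ONLINE = ON);
-- WITH (ONLINE = OFF) succeeds, at the cost of locking the table.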
nvarchar(max) is for columns up to 2 GB, so it essentially takes up more resources. You are better off using nvarchar(50) if you know you aren't going to need that much space. Each character is 2 bytes, so with 2 GB that's about 1 billion characters...
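A quick back-of-the-envelope check of that figure (nvarchar(max) tops out at 2^31 - 1 bytes, at 2 bytes per character):

SELECT (POWER(CAST(2 AS bigint), 31) - 1) / 2 AS max_chars;
-- 1073741823, i.e. roughly 1 billion characters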
By default, nvarchar(MAX) values are stored exactly the same as nvarchar(4000) values would be, unless the actual length exceeds 4000 characters; in that case, the in-row data is replaced by a pointer to one or more separate pages where the data is stored.
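A small sketch that makes the in-row vs. off-row behavior visible (table name is made up; sys.dm_db_index_physical_stats reports which allocation units actually hold pages):

CREATE TABLE dbo.MaxDemo (id int IDENTITY(1,1) PRIMARY KEY, body nvarchar(max));

INSERT INTO dbo.MaxDemo (body) VALUES (N'short value');  -- stays in-row
INSERT INTO dbo.MaxDemo (body)
VALUES (REPLICATE(CAST(N'x' AS nvarchar(max)), 10000));  -- moves to LOB pages

SELECT alloc_unit_type_desc, page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.MaxDemo'), NULL, NULL, 'DETAILED');
-- expect pages in both IN_ROW_DATA and LOB_DATA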
Strictly speaking, the MAX types will always be a bit slower than the non-MAX types; see Performance comparison of varchar(max) vs. varchar(N). But this difference is never visible in practice, where it just becomes noise in the overall performance, which is driven by IO.
Your main concern should not be the performance of MAX vs. non-MAX. You should be concerned with one question: will this column ever have to store more than 8000 bytes? If the answer is yes, even a very, very unlikely yes, then the answer is obvious: use a MAX type. The pain of converting the column to a MAX type later is not worth the minor performance benefit of the non-MAX types.
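The later conversion alluded to here looks harmless syntactically, but on a large table it is not free, since the column's storage format changes and the operation can be slow and fully logged (hypothetical names):

ALTER TABLE dbo.Notes ALTER COLUMN body nvarchar(max) NOT NULL;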
Other concerns (possibility to index that column, unavailability of ONLINE index operations for tables with MAX columns) were already addressed by Denis' answer.
BTW, the information about columns over 4 KB having their remaining data placed in an overflow area is wrong. The correct information is in Table and Index Organization:
ROW_OVERFLOW_DATA Allocation Unit
For every partition used by a table (heap or clustered table), index, or indexed view, there is one ROW_OVERFLOW_DATA allocation unit. This allocation unit contains zero (0) pages until a data row with variable length columns (varchar, nvarchar, varbinary, or sql_variant) in the IN_ROW_DATA allocation unit exceeds the 8 KB row size limit. When the size limitation is reached, SQL Server moves the column with the largest width from that row to a page in the ROW_OVERFLOW_DATA allocation unit. A 24-byte pointer to this off-row data is maintained on the original page.
So it is not columns over 4 KB, it is rows that don't fit in the free space on the page; and it is not the 'remaining' data that moves, it is the entire column.
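A sketch that demonstrates the row-overflow mechanism with non-MAX columns (made-up table; two nvarchar(3000) values are 12,000 bytes together, well over the 8 KB row limit):

CREATE TABLE dbo.OverflowDemo (c1 nvarchar(3000), c2 nvarchar(3000));

INSERT INTO dbo.OverflowDemo
VALUES (REPLICATE(N'a', 3000), REPLICATE(N'b', 3000));
-- the widest column is moved, whole, to ROW_OVERFLOW_DATA and a
-- 24-byte pointer is left in the row

SELECT alloc_unit_type_desc, page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.OverflowDemo'), NULL, NULL, 'DETAILED');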
An index cannot be created on a column over 900 bytes. Columns that are of the large object (LOB) data types ntext, text, varchar(max), nvarchar(max), varbinary(max), xml, or image cannot be specified as key columns for an index.
You can, however, use included columns:
All data types are allowed except text, ntext, and image. The index must be created or rebuilt offline (ONLINE = OFF) if any one of the specified non-key columns are varchar(max), nvarchar(max), or varbinary(max) data types.
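Both points in one sketch (hypothetical table): the key-column restriction and the INCLUDE workaround, with the offline requirement spelled out:

CREATE TABLE dbo.Articles (
    id    int IDENTITY(1,1) PRIMARY KEY,
    title nvarchar(200),
    body  nvarchar(max)
);

-- Fails: LOB types cannot be index key columns.
-- CREATE INDEX IX_Articles_body ON dbo.Articles (body);

-- Works: the MAX column is carried at the leaf level as an included column.
CREATE INDEX IX_Articles_title
    ON dbo.Articles (title)
    INCLUDE (body)
    WITH (ONLINE = OFF);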