nvarchar(max) versus nvarchar(n) in table variable columns

I do a lot of work with table variables before finally presenting result sets to the user. For example, I might pull in a whole load of columns from many different tables like this:

DECLARE @tmp TABLE
(
ID int,
username nvarchar(50),  -- data taken from tbl_Users
site nvarchar(50),      -- data taken from tbl_Sites
region nvarchar(100),   -- data taken from tbl_Regions
currency nvarchar(5)    -- data taken from tbl_Currencies
)

I spend a lot of time looking through the Object Explorer to ensure that the data length of the columns is correct (matches the original tables). Sometimes, if I change the table schema but don't update all the procedures, I get caught out with a truncation error.
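For instance, something like this (a contrived illustration, reusing tbl_Users from above) blows up if the source column has since been widened beyond the length declared in the table variable:

DECLARE @tmp TABLE
(
username nvarchar(10)   -- out of date: tbl_Users.username is now nvarchar(50)
)

INSERT INTO @tmp (username)
SELECT username FROM tbl_Users
-- fails with "String or binary data would be truncated"
-- as soon as any username is longer than 10 characters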

Is there any issue with taking a lazy approach and just doing this:

DECLARE @tmp TABLE
(
ID int,
username nvarchar(max),
site nvarchar(max),
region nvarchar(max),
currency nvarchar(max)
)

Does nvarchar(max) actually use up any more memory, or is the memory allocated based on the actual data size? Are there any other gotchas?

Please note that I am aware of third-party tools for jumping to the definition, but that's not what I'm asking.

UPDATE

The duplicate question has value, but it is not identical to this one IMHO. The duplicate revolves around the design of actual tables, not table variables. However, there is some merit in its answers, namely:

  • nvarchar(max) and nvarchar(n) do not differ in resource usage until the data grows beyond roughly 8,000 bytes
  • Business logic layers rely on structure and meaningful data, so specifying a column size that complements the original provides value

In that sense, it would seem that it is fine to use nvarchar(max) in table variables instead of nvarchar(n), but it carries reliability and performance risks in some environments. If you think this should be deleted then fair enough (but please stop arguing; I appreciate all input!)

asked Dec 19 '22 by EvilDr
1 Answer

I can't think of any reason nvarchar(max) in a table variable would have any pitfalls that are different from using nvarchar(max) in a table (the downsides of which are explained in this question), except for the pitfalls that are due to the differences between table variables and temp/permanent tables in the first place (e.g. terrible statistics, no secondary indexes, etc). Martin Smith draws a great comparison between table variables and temp tables here.

You still have to worry about certain issues. For example, if you are using ancient technology like classic ASP/ADO, you may find that you have to list MAX columns last to ensure the results are accurate. I explained this here back in 2000, before MAX types were introduced, but they have the same problems in those old providers as TEXT/NTEXT. It's highly unlikely you are using that technology, but I thought I would mention it just in case.

I'd suggest, though, that you just take the hit and script the correct types when you're writing the code. They are easy to derive from the metadata (e.g. sys.columns or right-clicking the table and saying script as > create to > clipboard) and doing so will prevent any problems (such as the one @JC. mentioned above regarding mismatched lengths, possibly leading to overflow/truncation).
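For example, a quick metadata query like this (a rough sketch; dbo.tbl_Users is the table name from the question) gives you the exact lengths to copy into the table variable declaration:

SELECT c.name AS column_name,
       t.name AS type_name,
       CASE WHEN c.max_length = -1 THEN 'MAX'
            ELSE CONVERT(varchar(10), c.max_length / 2)  -- for nvarchar, max_length is bytes (2 per character)
       END AS nvarchar_length
FROM sys.columns AS c
JOIN sys.types AS t ON t.user_type_id = c.user_type_id
WHERE c.[object_id] = OBJECT_ID(N'dbo.tbl_Users')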

Also, as I meant to imply earlier, if you have any substantial number of rows (thanks @Stuart), you should consider #temp tables instead. I still think that whatever you choose should be well-defined. The only benefit to using MAX for everything in this scenario is that it allows you to be lazy, while opening you up to a whole lot of risk. You write your code once, but your users run it countless times. Spend the extra couple of minutes to get your data types right, even if it means you have to correct them in two places later should the schema change.
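For example, the structure from the question as a #temp table (same lengths, just a sketch) keeps the explicit types while also giving you statistics and indexing options as the row counts grow:

CREATE TABLE #tmp
(
ID int,
username nvarchar(50),
site nvarchar(50),
region nvarchar(100),
currency nvarchar(5)
)

-- secondary indexes are possible here, unlike on a table variable (pre-2014)
CREATE INDEX IX_tmp_site ON #tmp (site)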

And as for the memory usage of nvarchar(max): yes, this can affect your performance. See this blog post for some evidence. A partial relevant snippet, with my spelling/grammar corrections:

So if you are sure that the length of your nvarchar column will be less than 8000, then do not define the column as nvarchar(max) but rather define it as nvarchar(fixedlength) wherever possible. The main advantages I see for using fixed length are:

A memory grant could be a big issue where the server is already memory starved. Because the expected row size is larger, the optimizer will estimate a larger memory grant, and this value will be much higher than actually required, wasting a resource as precious as memory. If you have a couple of queries that use an nvarchar(max) column and need sorting, the server might run into memory-related issues. This could be a big performance problem.

The other advantages he listed had to do with indexes. Not an issue with table variables anyway, since you can't create secondary indexes (before SQL Server 2014).

But once again, this potential problem is really no different no matter which type of table structure you are pulling the data from - temp table, table variable, permanent table, etc.
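If you want to see the memory grant difference for yourself, here is a rough sketch (contrived data; the exact numbers depend on version, row counts, and available memory). Run each SELECT with the actual execution plan enabled and compare the memory grants, or watch sys.dm_exec_query_memory_grants while they execute:

DECLARE @narrow TABLE (val nvarchar(50))
DECLARE @wide   TABLE (val nvarchar(max))

-- 100,000 identical 36-character strings in each table variable
INSERT @narrow (val)
SELECT TOP (100000) CONVERT(nvarchar(36), NEWID())
FROM sys.all_objects AS a CROSS JOIN sys.all_objects AS b

INSERT @wide (val) SELECT val FROM @narrow

-- Same data, but the optimizer assumes a much larger row size for the
-- nvarchar(max) column, so the second sort asks for a much bigger grant
SELECT val FROM @narrow ORDER BY val OPTION (RECOMPILE)
SELECT val FROM @wide   ORDER BY val OPTION (RECOMPILE)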

answered Dec 22 '22 by Aaron Bertrand