Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the performance penalty of XML data type in SQL Server when compared to NVARCHAR(MAX)?

I have a database that is going to keep log entries.

One of the columns in the log table contains serialized (to XML) objects and a guy on my team proposed to go with XML data type rather than NVARCHAR(MAX). This table will have logs kept "forever" (archiving some very old entries may be considered in the future).

I'm a little worried about the CPU overhead, but I'm even more worried that DB can grow faster (FoxyBOA from the referenced question got 70% bigger DB when using XML).

I have read this question and it gave me some ideas but I am particularly interested in clarification on whether the database size increases or decreases.

Can you please share your insight/experiences in that matter.

BTW. I don't currently have any need to depend on XML features within SQL Server (there's nearly zero advantage to me in the specific case). Ocasionally log entries will be extracted, but I prefer to handle the XML using .NET (either by writing a small client or using a function defined in a .NET assembly).

like image 404
Piotr Owsiak Avatar asked May 21 '10 15:05

Piotr Owsiak


People also ask

Does nvarchar Max affect performance?

Strictly speaking the MAX types will always be a bit slower than the non-MAX types, see Performance comparison of varchar(max) vs. varchar(N). But this difference is never visible in practice, where it just becomes noise in the overall performance driven by IO. Your main concern should not be performance of MAX vs.

What is the limit of nvarchar Max in SQL Server?

n defines the string size in byte-pairs and can be a value from 1 through 4,000. max indicates that the maximum storage size is 2^30-1 characters (2 GB). The storage size is two times n bytes + 2 bytes.

What is the difference between varchar Max and nvarchar Max?

The key difference between varchar and nvarchar is the way they are stored, varchar is stored as regular 8-bit data(1 byte per character) and nvarchar stores data at 2 bytes per character. Due to this reason, nvarchar can hold upto 4000 characters and it takes double the space as SQL varchar.

What is the difference between nvarchar 50 and nvarchar Max?

nvarchar max is for columns up to 2GB. So essentially it takes up more resources. You are better off using the nvarchar(50) if you know you aren't going to need that much space. each character is about 2 bytes so with 2 GB thats 1 billion characters...


1 Answers

If you do have XML, and you can be sure it's always XML, I would definitely recommend going that way. SQL Server stores XML in an optimized format - you don't even need any XML indices on that to benefit from it.

If you insert 5000 rows of a 5KB XML into an XML column, you get roughly 1250 pages = 9 MB. Inserting the same 5000 rows with the same 5KB XML into NVARCHAR(MAX) uses over 3700 pages or 29 MB - quite a difference!

And that difference should be even more pronounced if you can associate your XML with a XML schema stored in SQL Server. Plus you're also guaranteed that the XML stored conforms to a schema - can be very helpful at times! Can't do that with a plain NVARCHAR(MAX) column...

And I don't agree that using XML over NVARCHAR(MAX) has any performance penalty - quite the contrary. Since you're potentially retrieving less data from SQL Server when you're about to display or fetch the content, I would argue it's even a tad faster than NVARCHAR(MAX).

like image 57
marc_s Avatar answered Oct 04 '22 03:10

marc_s