Does it make more sense (except of server side validation XML/schema/dtd) to store XML in XML type instead of text/varchar/ntext? I'm not planning to do any XML manipulation on database side.
The purpose of my investigation is to decrease database size. Can I use XML data type for untyped XML for this purpose? What are the pros and cons?
I found an article related to the topic, but I am not sure if the authors assumptions/conclusions are correct.
If you store xml in an xml typed column, the data will not get stored as simple text, as in the nvarchar case, it will be stored in some sort of parsed data tree, which in turn will be smaller than the unparsed xml version. This not only decreases the database size, but gives you other advantages, like validation, easy manipulation etc. (even though you're not using any of these, still, they are there for future use).
On the other hand, the server will have to parse the data upon insertion, which will probably slow your database down - you have to make a decision of speed vs. size.
Edit:
Personally, I think that data in the database should be stored as xml only when it has structure which is hard to implement in a relational model, e.g. layouts, style descriptions etc. Usually that means that there won't be much data and speed is not an issue, thus added xml features, like data validation and manipulation ability (also, last but not least, the ability to click on the value in managment studio and see formatted xml - I really love that feature!), outweight the costs.
I don't have direct experience in storing large amounts of xml in the database and I wouldn't do that if I had the option, since it is almost always slower that a relational model, but if that would be the case, I'd recommend profiling both options, and choosing between size and speed that best suit your needs.
My quick investigation shows that MS SQL 2005 (Express Edition)
Microsoft SQL Server 2005 - 9.00.3073.00 (Intel X86) Aug 5 2008 12:31:12 Copyright (c) 1988-2005 Microsoft Corporation Express Edition on Windows NT 6.0 (Build 6000: )
store XML with overhead about 70% (possible for faster processing/parsing).
My data before conversion: rows=160320, reserved=178576 KB, data=178184 KB, index_size=272 KB, unused=120 KB
My data after the conversion: rows=160320, reserved=309702 KB, data=307216 KB, index_size=1672 KB, unused=184 KB
So it hasn't any sense to store XML data in XML data type if you are not planning use XML technology on database side.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With