Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it "bad" to store XML in a database?

Tags:

I have heard from several sources that storing XML in a database is "bad", but I have never seen/heard an actual explanation of why that is. Is it true? If it is true, can you explain why? Moreover, can you tell me what a "good" case for storing XML in a database is?

like image 488
Ender Avatar asked Dec 09 '10 21:12

Ender


People also ask

Can you store XML in database?

The need for saving XML files to a database is quite common since XML data can be found everywhere; and DataDirect XQuery has an easy and highly scalable solution for doing just that.

Where should I store XML files?

XML documents you insert into columns of type XML can reside either in the default storage object, or directly in the base table row. Base table row storage is under your control and is available only for small documents; larger documents are always stored in the default storage object.

Can we store XML in SQL?

In SQL Server, you usually store XML data in a column configured with the xml data type. The data type supports several methods that let you query and modify individual elements, attributes, and their values directly within the XML instance, rather than having to work with that instance as a whole.

Under what circumstances would it be prudent to use XML instead of a database?

XML is the "lingua franca" of the Web -- just about everyone can read and interpret it, unlike a database file. 2) When your data volume is small and you don't have to do complex queries against it. XML files are good for things like storing configuration or document templates.


2 Answers

There are some really stupid answers here - just because a database supports a data type does not mean you should be using it. These things are invariably added in as features because the competition have them, not because they are the right thing to do. Global variables? Triggers? Would anyone like to defend them too just because you can use them and they're there?

If you have multiple attributes, the best way to handle them in a relational database is with a one to many relationship. Parse out your useful data from the XML overhead. You then just store the ID (primary key) of the parent record with each of the rows stored in a second table, one row per attribute. You can have any number of attributes per parent record. It's database design 101, nothing clever. Storing it as unstructured XML just to store a variable number of attributes is not the way to go, it's a sledgehammer to crack a peanut. A one to many relationship between two tables is simpler, easier to understand, much faster to query, much less effort coding, and less storage (which means faster queries). Everyone wins, apart from the storage vendors.

XML is a data transfer protocol; as GolezTrol rightly said, "It is a way to export (and import) data" - i.e.: it is simply an overhead used to facilitate the communication of the structure of the data between different systems. Once received, the tags should be stripped out and the data (and only the data) stored in your database engine of choice, whatever that might be. Not the XML itself. The overhead for XML is ~10x that of the data it's describing. Want to tell your boss why that 100GB of data is occupying 1TB of space on your hyper expensive SAN? Or taking all night to back up over a saturated network link? Or causing performance problems in production? If you don't parse out the data from the now pointless tags, you will just push the problem and ongoing, daily support costs onto operational support for the next ten years. Sloppy, sloppy, sloppy. This keeps vendors like EMC in business.

XML is metadata. Nothing clever, just a schema descriptor. Once it's transferred and parsed it's lost its usefulness and is just clutter that clogs up whatever database you use. Get rid of it, unless you're compulsively addicted to hording yesterday's pointless crappy description metadata, stored many times over. Wake up. It's typical "Emperor's New Clothes" syndrome, stopped being conned by something simple and disposable. It's only metadata and it should not be stored or worshipped, it's junk once it's parsed. And what's better? To parse it once, or to uselessly parse it every time you need data from it? The answer's pretty darned obvious to me.

like image 87
Jon Avatar answered Sep 23 '22 17:09

Jon


It's not bad at all. Microsoft SQL Server has an XML data type. One use case for storing XML is a situation we found ourselves in. For each row in a particular table, we needed to store a variable number of attributes related to that row. And the number of these attributes can change over time, and with each row. We found it more efficient to store these attributes, and their values in an XML format. In the future, each time we adjust the number of attributes, we don't need to make schema changes.

like image 37
Randy Minder Avatar answered Sep 23 '22 17:09

Randy Minder