 

The Best Way to shred XML data into SQL Server database columns

Tags:

sql-server

xml

What is the best way to shred XML data into various database columns? So far I have mainly been using the nodes and value functions like so:

    INSERT INTO some_table (column1, column2, column3)
    SELECT
        Rows.n.value('(@column1)[1]', 'varchar(20)'),
        Rows.n.value('(@column2)[1]', 'nvarchar(100)'),
        Rows.n.value('(@column3)[1]', 'int')
    FROM @xml.nodes('//Rows') Rows(n)

However, I find that this gets very slow even for moderately sized XML data.
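For anyone who wants to reproduce the pattern, here is a minimal self-contained sketch of the nodes/value approach described above. The sample XML, the `<Root>` wrapper element, and the table variable standing in for some_table are all assumptions made for illustration, and the inline `DECLARE ... =` initialisation assumes SQL Server 2008 or later.

```sql
-- Sample document; the <Root> wrapper and the attribute values are made up.
DECLARE @xml xml = N'
<Root>
  <Rows column1="abc" column2="first row"  column3="1" />
  <Rows column1="def" column2="second row" column3="2" />
</Root>';

-- Stand-in for some_table so the script runs without any setup.
DECLARE @some_table TABLE (
    column1 varchar(20),
    column2 nvarchar(100),
    column3 int
);

-- One output row per <Rows> element; each value() call reads one attribute.
INSERT INTO @some_table (column1, column2, column3)
SELECT
    r.n.value('(@column1)[1]', 'varchar(20)'),
    r.n.value('(@column2)[1]', 'nvarchar(100)'),
    r.n.value('(@column3)[1]', 'int')
FROM @xml.nodes('/Root/Rows') AS r(n);

SELECT column1, column2, column3 FROM @some_table;
```

The sketch uses an explicit path (`/Root/Rows`) rather than `//Rows`, which avoids a descendant search over the whole document. Whether that makes a measurable difference depends on the data and was not measured here.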

eddiegroves asked Sep 14 '08 09:09



1 Answer

I stumbled across this question while having a very similar problem: I had been running a query that processed a 7.5 MB XML file (approximately 10,000 nodes) for around 3.5 to 4 hours before finally giving up.

However, after a little more research I found that once I had typed the XML using a schema and created an XML index (I had bulk inserted the XML into a table), the same query completed in about 0.04 ms.

How's that for a performance improvement!

Code to create a schema:

    IF EXISTS (SELECT * FROM sys.xml_schema_collections WHERE [name] = 'MyXmlSchema')
        DROP XML SCHEMA COLLECTION [MyXmlSchema]
    GO

    DECLARE @MySchema XML
    SET @MySchema =
    (
        SELECT * FROM OPENROWSET
        (
            BULK 'C:\Path\To\Schema\MySchema.xsd', SINGLE_CLOB
        ) AS xmlData
    )

    CREATE XML SCHEMA COLLECTION [MyXmlSchema] AS @MySchema
    GO

Code to create the table with a typed XML column:

    CREATE TABLE [dbo].[XmlFiles] (
        [Id] [uniqueidentifier] NOT NULL,
        -- Data from CV element
        [Data] xml(CONTENT dbo.[MyXmlSchema]) NOT NULL,
        CONSTRAINT [PK_XmlFiles] PRIMARY KEY NONCLUSTERED
        (
            [Id] ASC
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

Code to create the index:

    CREATE PRIMARY XML INDEX PXML_Data ON [dbo].[XmlFiles] (Data)
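The bulk insert into the table is mentioned above but not shown. A minimal sketch of one way to do it, assuming the schema collection and table from the previous steps already exist. The file path is a placeholder and not from the original post.

```sql
-- Hypothetical path: replace it with the location of your own XML file.
-- SINGLE_BLOB reads the file as varbinary(max), so the XML parser can
-- honour the encoding declared in the file itself.
INSERT INTO [dbo].[XmlFiles] ([Id], [Data])
SELECT NEWID(), CONVERT(xml, BulkColumn)
FROM OPENROWSET(BULK 'C:\Path\To\Data\MyData.xml', SINGLE_BLOB) AS xmlData;
```

Because the [Data] column is typed, the document is validated against MyXmlSchema on insert, so the statement fails if the file does not conform to the schema.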

There are a few things to bear in mind though. SQL Server's implementation of XML Schema doesn't support xsd:include. This means that if you have a schema that references other schemas, you'll have to copy all of them into a single schema and add that.

Also I would get an error:

XQuery [dbo.XmlFiles.Data.value()]: Cannot implicitly atomize or apply 'fn:data()' to complex content elements, found type 'xs:anyType' within inferred type 'element({http://www.mynamespace.fake/schemas}:SequenceNumber,xs:anyType) ?'. 

if I tried to navigate above the node I had selected with the nodes function. E.g.

    SELECT
        C.value('CVElementId[1]', 'INT') AS [CVElementId]
        ,C.value('../SequenceNumber[1]', 'INT') AS [Level]
    FROM
        [dbo].[XmlFiles]
    CROSS APPLY
        [Data].nodes('/CVSet/Level/CVElement') AS T(C)

I found that the best way to handle this was to use OUTER APPLY to, in effect, perform an "outer join" on the XML:

    SELECT
        C.value('CVElementId[1]', 'INT') AS [CVElementId]
        ,B.value('SequenceNumber[1]', 'INT') AS [Level]
    FROM
        [dbo].[XmlFiles]
    CROSS APPLY
        [Data].nodes('/CVSet/Level') AS T(B)
    OUTER APPLY
        B.nodes('CVElement') AS S(C)
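To try the CROSS APPLY / OUTER APPLY shape without setting up the schema and table, here is a small sketch against untyped XML in a table variable. The sample document is invented for illustration (element names follow the query above, and the namespace from the error message is left out), and the `@Files` table variable is a hypothetical stand-in for [dbo].[XmlFiles].

```sql
-- Stand-in for dbo.XmlFiles, using untyped XML so no schema is required.
DECLARE @Files TABLE (Id int, Data xml);

-- Made-up sample: the first Level has two CVElement children, the second has none.
INSERT INTO @Files (Id, Data)
VALUES (1, N'
<CVSet>
  <Level>
    <SequenceNumber>1</SequenceNumber>
    <CVElement><CVElementId>10</CVElementId></CVElement>
    <CVElement><CVElementId>11</CVElementId></CVElement>
  </Level>
  <Level>
    <SequenceNumber>2</SequenceNumber>
  </Level>
</CVSet>');

-- The outer nodes() call yields one row per Level (B); the inner call yields
-- one row per CVElement under that Level (C). SequenceNumber is read from B,
-- so no '../' navigation from C is needed.
SELECT
    C.value('CVElementId[1]', 'INT') AS [CVElementId]
    ,B.value('SequenceNumber[1]', 'INT') AS [Level]
FROM
    @Files AS f
CROSS APPLY
    f.Data.nodes('/CVSet/Level') AS T(B)
OUTER APPLY
    B.nodes('CVElement') AS S(C);
```

With OUTER APPLY, a Level that has no CVElement children still produces a row, with NULL in CVElementId. Switching to CROSS APPLY for the inner call would drop such levels from the result.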

Hope that helps someone, as that's pretty much been my day.

Dan answered Sep 19 '22 11:09