Flattening hierarchical XML in SQL using the nodes() method


Tags: sql, tsql, sql-server-2005, xml, sql-server-2008

I have a stored procedure that takes an XML document as a parameter, similar in structure to the following:

<grandparent name="grandpa bob">
  <parent name="papa john">
    <children>
      <child name="mark" />
      <child name="cindy" />
    </children>
  </parent>
  <parent name="papa henry">
    <children>
      <child name="mary" />
    </children>
  </parent>
</grandparent>

My requirement is to "flatten" this data so that it can be inserted into a temporary table and manipulated further down the procedure, so the above XML becomes:

Grandparent Name Parent Name     Child Name
---------------- --------------- ---------------
grandpa bob      papa john       mark
grandpa bob      papa john       cindy
grandpa bob      papa henry      mary

This is currently being done using the SQL Server XML nodes() method:

SELECT
    VIRT.node.value('../../../@name', 'varchar(15)') AS [Grandparent Name],
    VIRT.node.value('../../@name', 'varchar(15)') AS [Parent Name],
    VIRT.node.value('@name', 'varchar(15)') AS [Child Name]
FROM
    @xmlFamilyTree.nodes('/grandparent/parent/children/child') AS VIRT(node)

This works great until I start throwing large amounts of data at the procedure (e.g. 1000+ child nodes), at which point it grinds to a halt and takes between 1 and 2 minutes to execute. I think this may be because I am starting at the lowest level (child) and then traversing back up the XML document for each occurrence. Would splitting this single query into 3 chunks (one per node that I need data from) improve performance here? Given that none of these nodes have "keys" on them that I could use to join them back up, could anyone offer any pointers on how I might go about doing this?


asked Apr 27 '11 07:04

Matt Weldon

1 Answer

I seem to have answered my own question after a bit more looking around online:

SELECT
    grandparent.gname.value('@name', 'VARCHAR(15)'),
    parent.pname.value('@name', 'VARCHAR(15)'),
    child.cname.value('@name', 'VARCHAR(15)')
FROM
    @xmlFamilyTree.nodes('/grandparent') AS grandparent(gname)
CROSS APPLY
    grandparent.gname.nodes('*') AS parent(pname)
CROSS APPLY
    parent.pname.nodes('children/*') AS child(cname)

Using CROSS APPLY, I can select the top-level grandparent node, use it to select its parent child nodes, and so on down the tree, so no step has to navigate back up the document. With this method, my query went from executing in around 1 minute 30 seconds to around 6 seconds.
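As a minimal sketch of how this top-down approach could feed the temporary table mentioned in the question, the script below loads the sample document and inserts the flattened rows. The table name #FamilyTree, its column names, and the short aliases g, p and c are assumptions made for illustration. It also names the parent and child elements explicitly instead of using the * wildcard, which is a stylistic choice rather than part of the original answer.

```sql
-- Sample document from the question (DECLARE and SET are kept separate
-- so the script also runs on SQL Server 2005).
DECLARE @xmlFamilyTree xml;
SET @xmlFamilyTree = N'
<grandparent name="grandpa bob">
  <parent name="papa john">
    <children>
      <child name="mark" />
      <child name="cindy" />
    </children>
  </parent>
  <parent name="papa henry">
    <children>
      <child name="mary" />
    </children>
  </parent>
</grandparent>';

-- Hypothetical temporary table; names and sizes are assumptions.
CREATE TABLE #FamilyTree (
    GrandparentName varchar(15),
    ParentName      varchar(15),
    ChildName       varchar(15)
);

-- Shred top-down: each CROSS APPLY starts from the node found by the
-- previous step, so no expression uses the parent axis (..).
INSERT INTO #FamilyTree (GrandparentName, ParentName, ChildName)
SELECT
    g.n.value('@name', 'varchar(15)'),
    p.n.value('@name', 'varchar(15)'),
    c.n.value('@name', 'varchar(15)')
FROM
    @xmlFamilyTree.nodes('/grandparent') AS g(n)
CROSS APPLY
    g.n.nodes('parent') AS p(n)
CROSS APPLY
    p.n.nodes('children/child') AS c(n);

SELECT GrandparentName, ParentName, ChildName
FROM #FamilyTree;

DROP TABLE #FamilyTree;
```

For the sample document this should produce the three rows shown in the question, although the row order is not guaranteed without an ORDER BY.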

Interestingly though, if I use the "old" OPENXML method to retrieve the same data, the query executes in 1 second!
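The OPENXML query itself is not shown here, so the following is only a sketch of one form it could take, not necessarily the one that was timed. It uses the system procedures sp_xml_preparedocument and sp_xml_removedocument to create and release the document handle; the variable name @hDoc and the output column names are assumptions.

```sql
-- Sample document from the question.
DECLARE @xmlFamilyTree xml;
SET @xmlFamilyTree = N'
<grandparent name="grandpa bob">
  <parent name="papa john">
    <children>
      <child name="mark" />
      <child name="cindy" />
    </children>
  </parent>
  <parent name="papa henry">
    <children>
      <child name="mary" />
    </children>
  </parent>
</grandparent>';

DECLARE @hDoc int;  -- hypothetical name for the document handle

-- Parse the document once and obtain a handle to the in-memory tree.
EXEC sp_xml_preparedocument @hDoc OUTPUT, @xmlFamilyTree;

-- One row per child element; each column pattern is evaluated
-- relative to the child node matched by the row pattern.
SELECT GrandparentName, ParentName, ChildName
FROM OPENXML(@hDoc, '/grandparent/parent/children/child', 1)
WITH (
    GrandparentName varchar(15) '../../../@name',
    ParentName      varchar(15) '../../@name',
    ChildName       varchar(15) '@name'
);

-- Always release the handle, otherwise the parsed document stays in memory.
EXEC sp_xml_removedocument @hDoc;
```

Note that OPENXML keeps the parsed document in server memory until the handle is released, which is one reason to weigh it against nodes() rather than treat it as a universal replacement.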

It seems you may have to choose between these two techniques on a case-by-case basis, depending on the expected size and complexity of the document being passed in.

answered Oct 09 '22 01:10

Matt Weldon
