Use SQL Server 2005 XML APIs to normalize an XML fragment

I have some (untyped) XML being stored in SQL Server 2005 that I need to transform into a normalized structure. The structure of the document currently looks like so:

<wrapper>
 <parent />
 <node />
 <node />
 <node />

 <parent />
 <node />
 <node />
 <node />
</wrapper>

I want to transform it to look like this:

<wrapper>
 <parent>
  <node />
  <node />
  <node />
 </parent>
 <parent>
  <node />
  <node />
  <node />
 </parent>
</wrapper>

I can select the XML out into a relational structure if I need to, but the problem is that there are no attributes linking the parent and child nodes together, so order becomes an issue when using set-based operations. How can I use .nodes()/.value()/other SQL Server XML APIs to transform this data? The transformation needs to run as part of a batch SQL script, so extracting it into another tool or language is not a reasonable option for me.

asked Jan 29 '26 13:01 by jonathan.cone


1 Answer

Actually, the following code works (the grouping here may not be optimal, but it does the job):

-- SQL Server 2005 does not allow an initializer in DECLARE, so assign with SET
declare @xml xml
set @xml = '
    <wrapper>
     <parent id="1" />
     <node id="1" />
     <node id="2" />
     <node id="3" />

     <parent id="2" />
     <node id="4" />
     <node id="5" />
     <node id="6" />
    </wrapper> 
'

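-- Flatten every child of <wrapper> into one row per element.
-- RowNumber is assumed to follow document order; that is not guaranteed.
;with px as
(
    select row_number() over (order by (select 1)) as RowNumber
        ,t.v.value('@id', 'int') as Id
        ,t.v.value('local-name(.)', 'nvarchar(max)') as TagName
    from @xml.nodes('//wrapper/*') as t(v)
)
-- For each parent, collect the nodes that follow it with no other parent in between.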
select p.Id as [@id],
    (
        select n.Id as id
        from px n
        where n.TagName = 'node'
            and n.RowNumber > p.RowNumber
            and not exists
            (
                select null
                from px np
                where np.TagName = 'parent'
                    and np.RowNumber > p.RowNumber
                    and np.RowNumber < n.RowNumber
            )
        order by n.RowNumber
        for xml raw('node'), type
    )
from px p
where p.TagName = 'parent'
order by p.RowNumber
for xml path('parent'), root('wrapper')

However, I don't recommend using it. See http://msdn.microsoft.com/en-us/library/ms172038%28v=sql.90%29.aspx:

In SQLXML 4.0, document order is not always determined
So I'm not sure that we can rely on the order of the tags inside wrapper (the code above is more for fun than for practical use).
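
A possible alternative that avoids row_number() altogether is to do the grouping inside a single .query() call, using the XQuery node order comparison operator >>, so the grouping is expressed in terms of document order within the XML itself. The following is an untested sketch rather than a verified solution: the id attributes on the node elements are added only to make the grouping visible, and attributes on parent are not copied to the output.

```sql
-- Sketch only; sample data is hypothetical and mirrors the question's shape.
declare @xml xml
set @xml = '
<wrapper>
 <parent />
 <node id="1" />
 <node id="2" />
 <node id="3" />

 <parent />
 <node id="4" />
 <node id="5" />
 <node id="6" />
</wrapper>'

-- For each parent $p, keep the nodes that come after $p in document order
-- and do not come after the next parent that follows $p.
-- For the last parent there is no next parent, so the second test passes.
select @xml.query('
<wrapper>{
  for $p in /wrapper/parent
  return
    <parent>{
      /wrapper/node[. >> $p and not(. >> (/wrapper/parent[. >> $p])[1])]
    }</parent>
}</wrapper>')
```

If this behaves as intended, the first parent should wrap nodes 1 to 3 and the second should wrap nodes 4 to 6. The [1] on the inner path is there because the comparison operator expects at most a single node on each side.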
answered Feb 01 '26 06:02 by oryol
