Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I remove redundant namespace in nested query when using FOR XML PATH

UPDATE: I've discovered there is a Microsoft Connect item raised for this issue here

When using FOR XML PATH and WITH XMLNAMESPACES to declare a default namespace, I will get the namespace declaration duplicated in any top level nodes for nested queries that use FOR XML, I've stumbled across a few solutions on-line, but I'm not totally convinced...

Here's an Complete Example

/*
drop table t1
drop table t2
*/
create table t1 ( c1 int, c2 varchar(50))
create table t2 ( c1 int, c2 int, c3 varchar(50))
insert t1 values 
(1, 'Mouse'),
(2, 'Chicken'),
(3, 'Snake');
insert t2 values
(1, 1, 'Front Right'),
(2, 1, 'Front Left'),
(3, 1, 'Back Right'),
(4, 1, 'Back Left'),
(5, 2, 'Right'),
(6, 2, 'Left')



;with XmlNamespaces( default 'uri:animal')
select 
    a.c2 as "@species"
    , (select l.c3 as "text()" 
       from t2 l where l.c2 = a.c1 
       for xml path('leg'), type) as "legs"
from t1 a
for xml path('animal'), root('zoo')

What's the best solution?

like image 281
Dog Ears Avatar asked Jul 13 '10 22:07

Dog Ears


4 Answers

After hours of desperation and hundreds of trials & errors, I've come up with the solution below.

I had the same issue, when I wanted just one xmlns attribute, on the root node only. But I also had a very difficult query with lot's of subqueries and FOR XML EXPLICIT method alone was just too cumbersome. So yes, I wanted the convenience of FOR XML PATH in the subqueries and also to set my own xmlns.

I kindly borrowed the code of 8kb's answer, because it was so nice. I tweaked it a bit for better understanding. Here is the code:

DECLARE @Order TABLE (OrderID INT, OrderDate DATETIME)    
DECLARE @OrderDetail TABLE (OrderID INT, ItemID VARCHAR(1), Name VARCHAR(50), Qty INT)    
INSERT @Order VALUES (1, '2010-01-01'), (2, '2010-01-02')    
INSERT @OrderDetail VALUES (1, 'A', 'Drink',  5),
                           (1, 'B', 'Cup',    2),
                           (2, 'A', 'Drink',  2),
                           (2, 'C', 'Straw',  1),
                           (2, 'D', 'Napkin', 1)

-- Your ordinary FOR XML PATH query
DECLARE @xml XML = (SELECT OrderID AS "@OrderID",
                        (SELECT ItemID AS "@ItemID", 
                                Name AS "data()" 
                         FROM @OrderDetail 
                         WHERE OrderID = o.OrderID 
                         FOR XML PATH ('Item'), TYPE)
                    FROM @Order o 
                    FOR XML PATH ('Order'), ROOT('dummyTag'), TYPE)

-- Magic happens here!       
SELECT 1 AS Tag
      ,NULL AS Parent
      ,@xml AS [xml!1!!xmltext]
      ,'http://test.com/order' AS [xml!1!xmlns]
FOR XML EXPLICIT

Result:

<xml xmlns="http://test.com/order">
  <Order OrderID="1">
    <Item ItemID="A">Drink</Item>
    <Item ItemID="B">Cup</Item>
  </Order>
  <Order OrderID="2">
    <Item ItemID="A">Drink</Item>
    <Item ItemID="C">Straw</Item>
    <Item ItemID="D">Napkin</Item>
  </Order>
</xml>

If you selected @xml alone, you would see that it contains root node dummyTag. We don't need it, so we remove it by using directive xmltext in FOR XML EXPLICIT query:

,@xml AS [xml!1!!xmltext]

Although the explanation in MSDN sounds more sophisticated, but practically it tells the parser to select the contents of XML root node.

Not sure how fast the query is, yet currently I am relaxing and drinking Scotch like a gent while peacefully looking at the code...

like image 183
Gabrielius Avatar answered Nov 05 '22 04:11

Gabrielius


If I have understood correctly, you are referring to the behavior that you might see in a query like this:

DECLARE @Order TABLE (
  OrderID INT, 
  OrderDate DATETIME)

DECLARE @OrderDetail TABLE (
  OrderID INT, 
  ItemID VARCHAR(1), 
  ItemName VARCHAR(50), 
  Qty INT)

INSERT @Order 
VALUES 
(1, '2010-01-01'),
(2, '2010-01-02')

INSERT @OrderDetail 
VALUES 
(1, 'A', 'Drink',  5),
(1, 'B', 'Cup',    2),
(2, 'A', 'Drink',  2),
(2, 'C', 'Straw',  1),
(2, 'D', 'Napkin', 1)

;WITH XMLNAMESPACES('http://test.com/order' AS od) 
SELECT
  OrderID AS "@OrderID",
  (SELECT 
     ItemID AS "@od:ItemID", 
     ItemName AS "data()" 
   FROM @OrderDetail 
   WHERE OrderID = o.OrderID 
   FOR XML PATH ('od.Item'), TYPE)
FROM @Order o 
FOR XML PATH ('od.Order'), TYPE, ROOT('xml')

Which gives the following results:

<xml xmlns:od="http://test.com/order">
  <od.Order OrderID="1">
    <od.Item xmlns:od="http://test.com/order" od:ItemID="A">Drink</od.Item>
    <od.Item xmlns:od="http://test.com/order" od:ItemID="B">Cup</od.Item>
  </od.Order>
  <od.Order OrderID="2">
    <od.Item xmlns:od="http://test.com/order" od:ItemID="A">Drink</od.Item>
    <od.Item xmlns:od="http://test.com/order" od:ItemID="C">Straw</od.Item>
    <od.Item xmlns:od="http://test.com/order" od:ItemID="D">Napkin</od.Item>
  </od.Order>
</xml>

As you said, the namespace is repeated in the results of the subqueries.

This behavior is a feature according to a conversation on devnetnewsgroup (website now defunct) although there is the option to vote on changing it.

My proposed solution is to revert back to FOR XML EXPLICIT:

SELECT
  1 AS Tag,
  NULL AS Parent,
  'http://test.com/order' AS [xml!1!xmlns:od],
  NULL AS [od:Order!2],
  NULL AS [od:Order!2!OrderID],
  NULL AS [od:Item!3],
  NULL AS [od:Item!3!ItemID]
UNION ALL
SELECT 
  2 AS Tag,
  1 AS Parent,
  'http://test.com/order' AS [xml!1!xmlns:od],
  NULL AS [od:Order!2],
  OrderID AS [od:Order!2!OrderID],
  NULL AS [od:Item!3],
  NULL [od:Item!3!ItemID]
FROM @Order 
UNION ALL
SELECT
  3 AS Tag,
  2 AS Parent,
  'http://test.com/order' AS [xml!1!xmlns:od],
  NULL AS [od:Order!2],
  o.OrderID AS [od:Order!2!OrderID],
  d.ItemName AS [od:Item!3],
  d.ItemID AS [od:Item!3!ItemID]
FROM @Order o INNER JOIN @OrderDetail d ON o.OrderID = d.OrderID
ORDER BY [od:Order!2!OrderID], [od:Item!3!ItemID]
FOR XML EXPLICIT

And see these results:

<xml xmlns:od="http://test.com/order">
  <od:Order OrderID="1">
    <od:Item ItemID="A">Drink</od:Item>
    <od:Item ItemID="B">Cup</od:Item>
  </od:Order>
  <od:Order OrderID="2">
    <od:Item ItemID="A">Drink</od:Item>
    <od:Item ItemID="C">Straw</od:Item>
    <od:Item ItemID="D">Napkin</od:Item>
  </od:Order>
</xml>
like image 20
8kb Avatar answered Nov 05 '22 04:11

8kb


An alternative solution I've seen is to add the XMLNAMESPACES declaration after building the xml into a temporary variable:

declare @xml as xml;
select @xml = (
select 
    a.c2 as "@species"
    , (select l.c3 as "text()" 
       from t2 l where l.c2 = a.c1 
       for xml path('leg'), type) as "legs"
from t1 a
for xml path('animal'))

;with XmlNamespaces( 'uri:animal' as an)
select @xml for xml path('') , root('zoo');
like image 27
Dog Ears Avatar answered Nov 05 '22 05:11

Dog Ears


The problem here is compounded by the fact that you cannot directly declare the namespaces manually when using XML PATH. SQL Server will disallow any attribute names beginning with 'xmlns' and any tag names with colons in them.

Rather than having to resort to using the relatively unfriendly XML EXPLICIT, I got around the problem by first generating XML with 'cloaked' namespace definitions and references, then doing string replaces as follows ...

DECLARE @Order TABLE (
  OrderID INT, 
  OrderDate DATETIME)

DECLARE @OrderDetail TABLE (
  OrderID INT, 
  ItemID VARCHAR(1), 
  ItemName VARCHAR(50), 
  Qty INT)

INSERT @Order 
VALUES 
(1, '2010-01-01'),
(2, '2010-01-02')

INSERT @OrderDetail 
VALUES 
(1, 'A', 'Drink',  5),
(1, 'B', 'Cup',    2),
(2, 'A', 'Drink',  2),
(2, 'C', 'Straw',  1),
(2, 'D', 'Napkin', 1)

declare @xml xml

set @xml = (SELECT
  'http://test.com/order' as "@xxmlns..od",  -- 'Cloaked' namespace def
  (SELECT OrderID AS "@OrderID", 
    (SELECT 
      ItemID AS "@od..ItemID", 
      ItemName AS "data()" 
     FROM @OrderDetail 
     WHERE OrderID = o.OrderID 
     FOR XML PATH ('od..Item'), TYPE)
   FROM @Order o
   FOR XML PATH ('od..Order'), TYPE)
  FOR XML PATH('xml'))

set @xml = cast(replace(replace(cast(@xml as nvarchar(max)), 'xxmlns', 'xmlns'),'..',':') as xml)

select @xml

A few things to point out:

  1. I'm using 'xxmlns' as my cloaked version of 'xmlns' and '..' to stand in for ':'. This might not work for you if you're likely to have '..' as part of text values - you can substitute this with something else as long as you pick something that makes a valid XML identifier.

  2. Since we want the xmlns definition at the top level, we cannot use the 'ROOT' option to XML PATH - instead I needed to add an another outer level to the subselect structure to achieve this.

like image 1
David Claughton Avatar answered Nov 05 '22 05:11

David Claughton