
SQL Server 2008 - Add XML Declaration to XML Output

I've been battling with this one for a few days now. I'm looking to automate an XML output using the syntax below:

 SELECT (
   SELECT CONVERT(VARCHAR(10),GETDATE(),103)
   FOR XML PATH('DataVersion'), 
     TYPE
   ),
   (  
   SELECT CoNum,
     CoName,
     CONVERT(VARCHAR(10),AccountToDate,103) 'DLA',
     LAFileNet
   FROM @XMLOutput  
   FOR XML PATH('Company'),
     TYPE  
   )
 FOR XML PATH(''),
   ROOT('Companies')

This creates the output below:

<Companies>
  <DataVersion>15/11/2010</DataVersion>
  <Company>
    <CoNum>111</CoNum>
    <CoName>ABCLmt</CoName>
    <DLA>12/12/2010</DLA>
    <LAFileNet>1234</LAFileNet>
  </Company>
  <Company>
    <CoNum>222</CoNum>
    <CoName>DEFLmt</CoName>
    <DLA>12/12/2007</DLA>
    <LAFileNet>5678</LAFileNet>
  </Company>
</Companies>

What I'm struggling with is how to add the XML declaration <?xml version="1.0" encoding="ISO-8859-1" ?> to the top of the output.

Update 1: Would I be correct in thinking I need to create an XML schema in SQL Server to define the xsl:output, and then assign the output to that schema?

Update 2: I have since found these links: http://forums.asp.net/t/1455808.aspx (check out the comment from Jian Kang) and http://www.devnewsgroups.net/group/microsoft.public.sqlserver.xml/topic60022.aspx

Pixelated asked Nov 15 '10 12:11


2 Answers

TL;DR

Concatenate this: <?xml version="1.0" encoding="windows-1252" ?> with your XML, converted to varchar(max).

Details

I agree with j0N45 that the schema will not change anything. As the answer he references points out:

You have to add it manually.

I provided some example code to do so in another answer. Basically, you CONVERT the XML into varchar or nvarchar and then concatenate it with the XML declaration, such as <?xml version="1.0" encoding="windows-1252" ?>.
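
A minimal sketch of that concatenation, assuming a database whose collation uses code page 1252. The inline sample rows are hypothetical stand-ins for the question's @XMLOutput table variable, and the variable and column alias names are made up for illustration:

```sql
-- Build the XML first (TYPE keeps it as an xml value).
DECLARE @Companies xml;

SET @Companies = (
    SELECT CoNum, CoName
    FROM (SELECT 111 AS CoNum, 'ABCLmt' AS CoName
          UNION ALL
          SELECT 222, 'DEFLmt') AS SampleData   -- hypothetical sample rows
    FOR XML PATH('Company'), ROOT('Companies'), TYPE
);

-- Convert the xml to a non-Unicode string and prepend the declaration.
-- The encoding name must match the code page of the varchar data.
SELECT '<?xml version="1.0" encoding="windows-1252" ?>'
     + CONVERT(varchar(max), @Companies) AS XmlWithDeclaration;
```

The result is a varchar(max) string rather than an xml value. That is deliberate: the xml data type does not preserve the declaration, so converting the string back to xml would drop it again.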

However, it's important to choose the right encoding. SQL Server produces non-Unicode strings according to its collation settings. By default, that will be governed by the database collation settings, which you can determine using this SQL:

SELECT DATABASEPROPERTYEX('ExampleDatabaseName', 'Collation');

A common default collation is "SQL_Latin1_General_CP1_CI_AS", which has a code page of 1252. You can retrieve the code page with this SQL:

SELECT COLLATIONPROPERTY('SQL_Latin1_General_CP1_CI_AS', 'CodePage') AS 'CodePage';
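
The two lookups can probably be combined so that nothing is hard-coded. This is only a sketch: it assumes you want the current database (hence DB_NAME()), and it adds an explicit CONVERT because DATABASEPROPERTYEX returns sql_variant, which is not implicitly converted to the collation name that COLLATIONPROPERTY expects:

```sql
-- Code page of the current database's default collation.
SELECT COLLATIONPROPERTY(
           CONVERT(nvarchar(128), DATABASEPROPERTYEX(DB_NAME(), 'Collation')),
           'CodePage'
       ) AS CodePage;
```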

For code page 1252, you should use an encoding name of "windows-1252". The use of "ISO-8859-1" is inaccurate. You can test that using the "bullet" character: •. It has a Unicode Code Point value of 8226 (Hex 2022). You can generate the character in SQL reliably, regardless of collation, using this code:

SELECT NCHAR(8226);

It also has a code point of 149 in the windows-1252 code page, so if you are using the common default collation of "SQL_Latin1_General_CP1_CI_AS", you can also produce it using:

SELECT CHAR(149);
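
One way to check what CHAR(149) maps to under your own database's default collation is to compare its Unicode code point with the bullet's. A small sketch (the column aliases are made up for illustration):

```sql
-- UNICODE() returns the Unicode code point of the first character.
SELECT UNICODE(CHAR(149))   AS CodePointOfChar149,  -- 8226 only if the default code page is 1252
       UNICODE(NCHAR(8226)) AS CodePointOfBullet;   -- 8226 regardless of collation
```

If the two columns differ, the database's code page is not 1252 and "windows-1252" would be the wrong encoding name for its varchar output.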

However, CHAR(149) won't be a bullet in all collations. For example, if you try this:

SELECT CONVERT(char(1),char(149)) COLLATE Chinese_Hong_Kong_Stroke_90_BIN;

You don't get a bullet at all.

"ISO-8859-1" corresponds to Windows code page 28591. None of the SQL Server collations (in 2005 anyway) use that code page. You can list every collation along with its code page using:

SELECT [Name], [Description], [CodePage] = COLLATIONPROPERTY([Name], 'CodePage')
FROM sys.fn_helpcollations()
ORDER BY [CodePage] DESC;

You can further verify that "ISO-8859-1" is the wrong choice by trying to use it in SQL itself. The following SQL:

SELECT CONVERT(xml,'<?xml version="1.0" encoding="ISO-8859-1"?><test>•</test>');

Will produce XML which does not contain a bullet. Indeed, it won't produce any character, because ISO-8859-1 has no character defined for code point 149.

SQL Server handles Unicode strings differently. With Unicode strings (nvarchar), "there is no need for different code pages to handle different sets of characters". However, SQL Server does NOT use "UTF-8" encoding. If you try to use it within SQL itself:

SELECT CONVERT(xml,N'<?xml version="1.0" encoding="UTF-8"?><test>•</test>');

You will get an error:

Msg 9402, Level 16, State 1, Line 1
XML parsing: line 1, character 38, unable to switch the encoding

Rather, SQL Server uses "UCS-2" encoding for Unicode strings, so this will work:

SELECT CONVERT(xml,N'<?xml version="1.0" encoding="UCS-2"?><test>•</test>');
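
The same concatenation approach should carry over to Unicode output. The sketch below is an assumption-laden illustration rather than a tested recipe: it builds an nvarchar document with a UCS-2 declaration and parses it back as a sanity check. The variable name is hypothetical, and the bullet is produced with NCHAR(8226) so the script does not depend on the encoding of the file it is saved in:

```sql
DECLARE @Doc nvarchar(max);

-- Prepend a declaration that matches nvarchar data.
SET @Doc = N'<?xml version="1.0" encoding="UCS-2"?>'
         + CONVERT(nvarchar(max),
                   CONVERT(xml, N'<test>' + NCHAR(8226) + N'</test>'));

-- Parsing the string back fails if the declared encoding
-- does not match the string type.
SELECT @Doc AS XmlWithDeclaration,
       CONVERT(xml, @Doc) AS ParsedBack;
```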

Riley Major answered Nov 06 '22 00:11


I think this answers your question: How to add xml encoding <?xml version="1.0" encoding="UTF-8"?> to xml Output in SQL Server.

I don't think creating a schema would change anything, because it is only used for validation.

Cheers

j0N45 answered Nov 06 '22 00:11