
Convert a string to XML and insert it into SQL Server

We have a SQL Server 2008 R2 database table with XML stored in a column of VARCHAR data type.

I now have to fetch some of the elements of the XML.

So I first want to convert the XML stored as VARCHAR into the XML data type.

Example :

Table A

Id(int) , ProductXML (varchar(max))

Table B

Id(int), ProductXML(XML)

I want to convert the ProductXML from Table A into XML data type and insert into Table B.

I tried using the CAST() and CONVERT() functions as shown below:

insert into TableB (ProductXML)
select CAST(ProductXML as XML) from TableA;

I tried CONVERT() the same way, but I get an error:

XML Parsing : unable to switch encoding

Is there any way I can convert the varchar entries in the table into XML entries ?

About the XML: it is huge with many nodes, and its structure changes dynamically.

Example: one row can have an XML entry for a single product, and another row can have an XML entry for multiple products.

CodeNinja asked Apr 11 '13 15:04


2 Answers

Please post a sample of your XML, since all of these work:

CONVERT(XML, '<root><child/></root>')
CONVERT(XML, '<root>          <child/>         </root>', 1)
CAST('<Name><FName>Carol</FName><LName>Elliot</LName></Name>'  AS XML)

Also you might have to cast it to nvarchar or varbinary first (from Microsoft documentation):

You can parse any of the SQL Server string data types, such as [n][var]char, [n]text, varbinary, and image, into the xml data type by casting (CAST) or converting (CONVERT) the string to the xml data type. Untyped XML is checked to confirm that it is well formed. If there is a schema associated with the xml type, validation is also performed. For more information, see Compare Typed XML to Untyped XML.

XML documents can be encoded with different encodings (for example, UTF-8, UTF-16, windows-1252). The following outlines the rules on how the string and binary source types interact with the XML document encoding and how the parser behaves.

Since nvarchar assumes a two-byte unicode encoding such as UTF-16 or UCS-2, the XML parser will treat the string value as a two-byte Unicode encoded XML document or fragment. This means that the XML document needs to be encoded in a two-byte Unicode encoding as well to be compatible with the source data type. A UTF-16 encoded XML document can have a UTF-16 byte order mark (BOM), but it does not need to, since the context of the source type makes it clear that it can only be a two-byte Unicode encoded document.

The content of a varchar string is treated as a one-byte encoded XML document/fragment by the XML parser. Since the varchar source string has a code page associated, the parser will use that code page for the encoding if no explicit encoding is specified in the XML itself. If an XML instance has a BOM or an encoding declaration, the BOM or declaration needs to be consistent with the code page, otherwise the parser will report an error.

The content of varbinary is treated as a codepoint stream that is passed directly to the XML parser. Thus, the XML document or fragment needs to provide the BOM or other encoding information inline. The parser will only look at the stream to determine the encoding. This means that UTF-16 encoded XML needs to provide the UTF-16 BOM and an instance without BOM and without a declaration encoding will be interpreted as UTF-8.

If the encoding of the XML document is not known in advance and the data is passed as string or binary data instead of XML data before casting to XML, it is recommended to treat the data as varbinary. For example, when reading data from an XML file using OpenRowset(), one should specify the data to be read as a varbinary(max) value:

select CAST(x as XML) 
from OpenRowset(BULK 'filename.xml', SINGLE_BLOB) R(x)

SQL Server internally represents XML in an efficient binary representation that uses UTF-16 encoding. User-provided encoding is not preserved, but is considered during the parse process.

Solution:

CONVERT(XML, CONVERT(NVARCHAR(max), ProductXML))
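
Applied to the tables from the question, this might look like the sketch below. It assumes the stored documents either carry no encoding declaration or declare UTF-16; a declaration of utf-8 would still conflict with NVARCHAR and raise the same error. TableA, TableB, and ProductXML are the names used in the question, and the literal documents are made up for illustration.

```sql
-- A VARCHAR value whose declaration claims UTF-16 does not parse directly
-- (uncomment to reproduce the "unable to switch the encoding" error):
-- SELECT CAST('<?xml version="1.0" encoding="utf-16"?><root/>' AS XML);

-- The same text parses once it really is a two-byte Unicode string:
SELECT CONVERT(XML, CONVERT(NVARCHAR(MAX),
       '<?xml version="1.0" encoding="utf-16"?><root/>'));

-- The same conversion applied to the question's tables (assumed layout):
INSERT INTO TableB (ProductXML)
SELECT CONVERT(XML, CONVERT(NVARCHAR(MAX), ProductXML))
FROM TableA;
```

If a single row is not well-formed XML, the whole INSERT fails, so it may be worth running the SELECT on its own first.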
Darek answered Oct 29 '22 22:10


This worked for me:

select CAST(REPLACE(CAST(column3 AS NVARCHAR(MAX)),'utf-8','utf-16') AS XML) from table
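
The reason this helps is probably the one described in the documentation quoted in the other answer: once the text is NVARCHAR, the parser expects a two-byte Unicode document, so a declaration that still says utf-8 conflicts with it. A minimal sketch with a made-up document (the variable name and XML content are assumptions, not the asker's data):

```sql
DECLARE @doc VARCHAR(MAX) =
    '<?xml version="1.0" encoding="utf-8"?><Products><Product Id="1"/></Products>';

-- Fails: the string is now UTF-16, but the declaration still says utf-8.
-- SELECT CAST(CAST(@doc AS NVARCHAR(MAX)) AS XML);

-- Works: the declaration is rewritten to match the NVARCHAR encoding.
SELECT CAST(REPLACE(CAST(@doc AS NVARCHAR(MAX)), 'utf-8', 'utf-16') AS XML);
```

Note that REPLACE rewrites every occurrence of the text utf-8 in the document, not only the one in the declaration, so this is only safe if that text does not appear elsewhere in the data.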
DubMan answered Oct 29 '22 23:10