In Delphi XE2, I'm doing an XSLT transform on a received XML file to remove all namespace information.
Problem: It changes
<?xml version="1.0" encoding="utf-8"?>
into
<?xml version="1.0" encoding="utf-16"?>
This is the XML that I get back from Exchange server:
<?xml version="1.0" encoding="utf-8"?>
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
<s:Header>
<h:ServerVersionInfo MajorVersion="14" MinorVersion="0" MajorBuildNumber="722" MinorBuildNumber="0" Version="Exchange2010" xmlns:h="http://schemas.microsoft.com/exchange/services/2006/types" xmlns="http://schemas.microsoft.com/exchange/services/2006/types" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"/>
</s:Header>
<s:Body xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<m:ResolveNamesResponse xmlns:m="http://schemas.microsoft.com/exchange/services/2006/messages" xmlns:t="http://schemas.microsoft.com/exchange/services/2006/types">
<m:ResponseMessages>
<m:ResolveNamesResponseMessage ResponseClass="Success">
<m:ResponseCode>NoError</m:ResponseCode>
<m:ResolutionSet TotalItemsInView="1" IncludesLastItemInRange="true">
<t:Resolution>
<t:Mailbox>
<t:Name>developer</t:Name>
<t:EmailAddress>[email protected]</t:EmailAddress>
<t:RoutingType>SMTP</t:RoutingType>
<t:MailboxType>Mailbox</t:MailboxType>
</t:Mailbox>
<t:Contact>
<t:Culture>nl-NL</t:Culture>
<t:DisplayName>developer</t:DisplayName>
<t:GivenName>developer</t:GivenName>
<t:EmailAddresses>
<t:Entry Key="EmailAddress1">SMTP:[email protected]</t:Entry>
</t:EmailAddresses>
<t:ContactSource>ActiveDirectory</t:ContactSource>
</t:Contact>
</t:Resolution>
</m:ResolutionSet>
</m:ResolveNamesResponseMessage>
</m:ResponseMessages>
</m:ResolveNamesResponse>
</s:Body>
</s:Envelope>
This is the function that removes the namespace info:
Uses
MSXML2_TLB; // IXMLDOMdocument
class function TXMLHelper.RemoveNameSpaces(XMLString: String): String;
const
// An XSLT script for removing the namespaces from any document.
// From http://wiki.tei-c.org/index.php/Remove-Namespaces.xsl
cRemoveNSTransform =
'<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">' +
'<xsl:output method="xml" indent="no"/>' +
'<xsl:template match="/|comment()|processing-instruction()">' +
' <xsl:copy>' +
' <xsl:apply-templates/>' +
' </xsl:copy>' +
'</xsl:template>' +
'<xsl:template match="*">' +
' <xsl:element name="{local-name()}">' +
' <xsl:apply-templates select="@*|node()"/>' +
' </xsl:element>' +
'</xsl:template>' +
'<xsl:template match="@*">' +
' <xsl:attribute name="{local-name()}">' +
' <xsl:value-of select="."/>' +
' </xsl:attribute>' +
'</xsl:template>' +
'</xsl:stylesheet>';
var
Doc, XSL: IXMLDOMdocument2;
begin
Doc := ComsDOMDocument.Create;
Doc.ASync := false;
XSL := ComsDOMDocument.Create;
XSL.ASync := false;
try
Doc.loadXML(XMLString);
XSL.loadXML(cRemoveNSTransform);
Result := Doc.TransFormNode(XSL);
except
on E:Exception do Result := E.Message;
end;
end; { RemoveNameSpaces }
But after this, it's suddenly a utf-16 document:
<?xml version="1.0" encoding="UTF-16"?>
<Envelope>
[snip]
</Envelope>
After Googling "xsl utf-8 utf-16" I tried several things:
Change the line (e.g. Output DataTable XML in UTF8 rather than UTF16)
<xsl:output method="xml" indent="no"/>
into either:
<xsl:output method="xml" encoding="utf-8" indent="no"/>
<xsl:output method="xml" encoding="utf-8"/>
<xsl:output encoding="utf-8"/>
That did not work.
(It would be the optimal solution, according to http://www.xml.com/pub/a/2002/09/04/xslt.html "The encoding attribute actually does more than add an encoding declaration to the result document; it tells the XSLT processor to write out the result using that encoding.")
Change the line (e.g. XslCompiledTransform uses UTF-16 encoding)
<xsl:output method="xml" indent="no"/>
into
<xsl:output method="xml" omit-xml-declaration="yes" indent="no" />
which leaves out the starting xml tag, but if I then just prepend
<?xml version="1.0" encoding="utf-8"?>
I will lose characters because no actual utf conversion is done.
IXMLDOMdocument2 does not have an Encoding property.
Any ideas how to fix this?
Remarks/background:
If all else fails there's maybe still the option to change the utf-16 XML data to utf-8, but that's an entirely different approach.
I'm trying to do everything utf-8 because I'm communicating with Exchange server through EWS, and setting the http request header to utf-16 does not work: Exchange tells me that the content-type 'text/xml; charset = utf-16' is not the expected type 'text/xml; charset = utf-8'. EWS returns utf-8 (see start of post).
The problem is the use of the transformNode method: it returns a string, and with MSXML such a string is UTF-16 encoded. Instead, create an empty MSXML DOM document for the result and call the transformNodeToObject method, passing the empty DOM document as the second argument. You can then save the result document to a file or stream, and the encoding should be as specified in the xsl:output directive.
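A minimal sketch of that approach, reusing the MSXML2_TLB import and the ComsDOMDocument coclass from the question. The procedure name TransformToFile and its parameters are hypothetical, and it assumes the stylesheet passed in XSLString declares encoding="utf-8" in its xsl:output element. I have not run this against XE2, so treat it as a starting point rather than a tested implementation:

```pascal
uses
  SysUtils,
  MSXML2_TLB; // same type-library import as in the question

// Hypothetical helper: transforms XMLString with XSLString and saves the
// result document to FileName instead of returning a (UTF-16) string.
procedure TransformToFile(const XMLString, XSLString, FileName: string);
var
  Doc, XSL, OutDoc: IXMLDOMDocument2;
begin
  Doc := ComsDOMDocument.Create;
  Doc.async := False;
  XSL := ComsDOMDocument.Create;
  XSL.async := False;
  OutDoc := ComsDOMDocument.Create; // empty document that receives the result

  // loadXML returns False on a parse error; surface the parser's message
  if not Doc.loadXML(XMLString) then
    raise Exception.Create(Doc.parseError.reason);
  if not XSL.loadXML(XSLString) then
    raise Exception.Create(XSL.parseError.reason);

  // Transform into a DOM document rather than into a string
  Doc.transformNodeToObject(XSL, OutDoc);

  // Saving the document lets MSXML serialize it in the document's encoding
  OutDoc.save(FileName);
end;
```

The save method also accepts other destinations, such as an object implementing IStream, so the same pattern should work when the result has to stay in memory.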
To use IXMLDocument in your original code, it should look like this:
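var
iInp, iOtp, iXsl: IXMLDocument; // needs XMLIntf and XMLDoc in the uses clause
Utf8: UTF8String;
begin
iInp := LoadXMLData(XMLString);
iXsl := LoadXMLData(cRemoveNSTransform);
iOtp := NewXMLDocument;
// Transform into a document, not a string, so no UTF-16 string is produced
iInp.Node.TransformNode(iXsl.Node, iOtp);
iOtp.SaveToXML(Utf8);
end;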
Now the variable Utf8 should contain the transformed XML in UTF-8 encoding. If you want to save to a stream or file instead, replace SaveToXML with:
iOtp.Encoding := 'UTF-8';
iOtp.SaveToFile(....);