Reading the documentation for Java's org.w3c.dom.ls package, it seems that an Element can only be serialized to a String using Java's native string encoding, UTF-16. However, I need to create a UTF-8 string, escaped or otherwise; I understand that it will still be a UTF-16 String internally. Does anyone have an idea how to get around this? I need to pass the string to a generated WS client that will consume it, and at that point it should be UTF-8.
The code I use to create the string:
DOMImplementationRegistry domImplementationRegistry = DOMImplementationRegistry.newInstance();
DOMImplementationLS domImplementationLS = (DOMImplementationLS) domImplementationRegistry.getDOMImplementation("LS");
LSSerializer writer = domImplementationLS.createLSSerializer();
String result = writer.writeToString(element);
You can still use DOMImplementationLS:
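DOMImplementationRegistry domImplementationRegistry = DOMImplementationRegistry.newInstance();
DOMImplementationLS domImplementationLS = (DOMImplementationLS) domImplementationRegistry.getDOMImplementation("LS");
LSSerializer lsSerializer = domImplementationLS.createLSSerializer();
LSOutput lsOutput = domImplementationLS.createLSOutput();
// Request UTF-8 as the output encoding instead of the UTF-16 default of writeToString
lsOutput.setEncoding("UTF-8");
Writer stringWriter = new StringWriter();
lsOutput.setCharacterStream(stringWriter);
// doc is the Node to serialize (a Document, or the Element from the question)
lsSerializer.write(doc, lsOutput);
String result = stringWriter.toString();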
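Because a Java String is always UTF-16 in memory, "a UTF-8 string" ultimately means a sequence of bytes. With a character stream such as StringWriter, the encoding set on the LSOutput should only affect the encoding named in the XML declaration. If real UTF-8 bytes are needed, one option is to give the LSOutput a byte stream instead. The following is a minimal sketch of that variant; the class and method names (LsUtf8Example, toUtf8Bytes, sampleElement) are made up for illustration and are not part of the original answer.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;

// Hypothetical helper class; the names are illustrative, not from the original post.
final class LsUtf8Example {

    // Serializes a DOM node to UTF-8 encoded bytes using DOM Load and Save.
    static byte[] toUtf8Bytes(Node node) {
        try {
            DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            DOMImplementationLS ls = (DOMImplementationLS) registry.getDOMImplementation("LS");
            LSSerializer serializer = ls.createLSSerializer();

            LSOutput output = ls.createLSOutput();
            output.setEncoding("UTF-8");
            // A byte stream lets the serializer do the character-to-byte encoding itself.
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            output.setByteStream(bytes);

            serializer.write(node, output);
            return bytes.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    // Builds a small element containing a non-ASCII character for the demo.
    static Element sampleElement() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element element = doc.createElement("greeting");
            element.setTextContent("caf\u00e9");
            doc.appendChild(element);
            return element;
        } catch (Exception e) {
            throw new IllegalStateException("Could not build sample element", e);
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = toUtf8Bytes(sampleElement());
        // Decode only for display; the byte array itself is the UTF-8 form.
        System.out.println(new String(utf8, StandardCharsets.UTF_8));
    }
}
```

If the WS client insists on a String parameter, decoding the bytes with StandardCharsets.UTF_8 gives a String whose content matches what was serialized; how the client then encodes it on the wire is up to the client.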
I find that the most flexible way of serializing a DOM to a String is to use the javax.xml.transform API:
Node node = ...
StringWriter output = new StringWriter();
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.transform(new DOMSource(node), new StreamResult(output));
String xml = output.toString();
It's not especially elegant, but it should give you better control over output encoding.
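To make the point about encoding control concrete, here is a rough sketch that sets the standard OutputKeys.ENCODING output property and writes to a byte stream, so the result is actual UTF-8 bytes. The class and method names (TransformerUtf8Example, toUtf8Bytes, sampleElement) are hypothetical and only serve the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper class; the names are illustrative, not from the original post.
final class TransformerUtf8Example {

    // Serializes a DOM node to UTF-8 encoded bytes with an identity transform.
    static byte[] toUtf8Bytes(Node node) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Ask the serializer to encode its output as UTF-8.
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(node), new StreamResult(bytes));
            return bytes.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    // Builds a small element containing a non-ASCII character for the demo.
    static Element sampleElement() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element element = doc.createElement("greeting");
            element.setTextContent("caf\u00e9");
            doc.appendChild(element);
            return element;
        } catch (Exception e) {
            throw new IllegalStateException("Could not build sample element", e);
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = toUtf8Bytes(sampleElement());
        // Decode only for display; the byte array itself is the UTF-8 form.
        System.out.println(new String(utf8, StandardCharsets.UTF_8));
    }
}
```

Other output properties, such as OutputKeys.OMIT_XML_DECLARATION or OutputKeys.INDENT, can be set on the same Transformer in the same way.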