I have a Java program that uses TransformerFactory to transform an XML string fetched from a SQL Server database, writes the result to a file, and then uses that file to generate a PDF.
The thing is, it works fine when I run it from NetBeans, but if I execute the JAR in the project's dist folder I get an "Invalid byte 2 of 4-byte UTF-8 sequence" error.
After changing the encoding of the XML string to UTF-8, it now works from the JAR too.
So my question is: why did it work when running the project in NetBeans but not from the JAR file, before I changed the encoding?
I have only tried this on Windows.
Code:
Here is the SQL Server query (original):
SQLXML xml = null;
String xmlString = "";
while (rs.next()) {
    xml = rs.getSQLXML(1);
    xmlString = xml.getString();
}
return xmlString;
...and modified:
SQLXML xml = null;
String xmlString = "";
while (rs.next()) {
    xml = rs.getSQLXML(1);
    // Note explicit UTF-8 encoding specified
    xmlString = new String(xml.getString().getBytes(), "UTF8");
}
return xmlString;
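A note on why that change is fragile: the no-argument getString().getBytes() call encodes with the platform default charset, and only the subsequent decode names UTF-8, so the round-trip works only when the default happens to be compatible. A minimal self-contained sketch (not tied to the database code) of how a charset mismatch mangles a multi-byte character, and how naming the charset on both sides avoids it:

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    public static void main(String[] args) throws Exception {
        // UTF-8 bytes for "añadir"; the ñ is the two-byte sequence 0xC3 0xB1
        byte[] utf8Bytes = "añadir".getBytes(StandardCharsets.UTF_8);

        // Decoding those bytes with the wrong charset splits the sequence...
        String wrong = new String(utf8Bytes, StandardCharsets.ISO_8859_1);

        // ...while naming UTF-8 explicitly round-trips correctly.
        String right = new String(utf8Bytes, StandardCharsets.UTF_8);

        System.out.println(wrong);  // mangled: the ñ becomes two characters
        System.out.println(right);  // prints "añadir"
    }
}
```

The same applies in reverse: code that relies on the default charset behaves differently under NetBeans and under a plain `java -jar` launch.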
And here the transformation:
public static void serialize(Document doc, OutputStream out) throws Exception {
    TransformerFactory tfactory = TransformerFactory.newInstance();
    try {
        Transformer serializer = tfactory.newTransformer();
        serializer.setOutputProperty("indent", "yes");
        serializer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
        serializer.transform(new DOMSource(doc), new StreamResult(out));
    } catch (TransformerException e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    }
}
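One way to make the serialization itself independent of the platform default is to pin the transformer's output encoding via OutputKeys.ENCODING. A hedged sketch using a trivial in-memory document (the single `root` element is just for illustration):

```java
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

public class SerializeDemo {
    public static void main(String[] args) throws Exception {
        // Build a trivial DOM document to serialize
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        doc.appendChild(doc.createElement("root"));

        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        // Pin the output encoding so the bytes written do not depend on
        // the JVM's default charset
        serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        serializer.setOutputProperty(OutputKeys.INDENT, "yes");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        System.out.println(out.toString("UTF-8"));
    }
}
```

With the encoding pinned, the XML declaration in the output also advertises UTF-8, so downstream parsers decode the file consistently.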
I tried a simple application in NetBeans that prints Charset.defaultCharset(), and it returns "UTF-8". The same application in Eclipse returns "MacRoman". I'm on a Mac; on Windows it would return "windows-1252" (Cp1252).
So yes, when you run an application from NetBeans it defaults to UTF-8 encoding, which is why you didn't have any issues parsing the XML there.
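You can check this yourself: NetBeans typically launches the JVM with -Dfile.encoding set to the project encoding, while a plain `java -jar` run inherits the OS default. A tiny probe to compare both launch modes:

```java
import java.nio.charset.Charset;

public class DefaultCharsetProbe {
    public static void main(String[] args) {
        // file.encoding is the system property the JVM consults at startup;
        // Charset.defaultCharset() is the resolved charset actually in use.
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        System.out.println("defaultCharset = " + Charset.defaultCharset());
    }
}
```

If you want the JAR to behave like the IDE run without touching the code, launching it with `java -Dfile.encoding=UTF-8 -jar yourapp.jar` should reproduce the NetBeans behavior (note that on recent JDKs, 18 and later, UTF-8 is already the default everywhere).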