
How to control encoding with Jackson XmlMapper?

I can't find an obvious way to change the encoding of the serialized XML from the default UTF-8 to ISO-8859-1. The Usage Guide makes me think this should be possible by using an XMLOutputFactory with XmlFactory, but I can't see how to configure either factory to use another encoding by default; there is only createXMLEventWriter, where I could pass in an encoding.

I know how to generate the XML declaration using ToXmlGenerator.Feature.WRITE_XML_DECLARATION. So what I need is a declaration like this:

<?xml version='1.0' encoding='ISO-8859-1'?>

And of course the content should be encoded in ISO-8859-1, too.

Hein Blöd asked Oct 15 '25 15:10



2 Answers

In the ToXmlGenerator source code, you'll find that UTF-8 is hard-coded:

if (Feature.WRITE_XML_1_1.enabledIn(_formatFeatures)) {
    _xmlWriter.writeStartDocument("UTF-8", "1.1");
} else if (Feature.WRITE_XML_DECLARATION.enabledIn(_formatFeatures)) {
    _xmlWriter.writeStartDocument("UTF-8", "1.0");
} else {
    return;
}

Since ToXmlGenerator is final, there might not be an easy way to override this. I've submitted an issue in the jackson-dataformat-xml project.
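
For context, the StAX call quoted above does take an encoding argument, so the limitation is in what ToXmlGenerator passes, not in StAX itself. Here is a minimal sketch that uses only the JDK's bundled StAX implementation (not Jackson, and not a fix for XmlMapper); the class and method names are hypothetical, and behavior may differ with other StAX providers such as Woodstox:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class: plain StAX, no Jackson involved.
final class StaxEncodingSketch {

    // Writes <name>text</name>, using the same encoding for the
    // declaration and for the bytes sent to the stream.
    static byte[] write(String text, String encoding) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            XMLStreamWriter xml =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(out, encoding);
            // This is the call where ToXmlGenerator passes a fixed "UTF-8".
            xml.writeStartDocument(encoding, "1.0");
            xml.writeStartElement("name");
            xml.writeCharacters(text);
            xml.writeEndElement();
            xml.writeEndDocument();
            xml.flush();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = write("Hein Bl\u00f6d", "ISO-8859-1");
        System.out.println(new String(bytes, StandardCharsets.ISO_8859_1));
    }
}
```

Passing the same encoding to both createXMLStreamWriter and writeStartDocument keeps the declaration and the byte stream in agreement.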


If you stick to JAXB, you can control the value of the encoding attribute using Marshaller.JAXB_ENCODING:

Marshaller marshaller = jaxbContext.createMarshaller();
marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
marshaller.setProperty(Marshaller.JAXB_ENCODING, "ISO-8859-1");
marshaller.marshal(foo, System.out);

See this answer.

cassiomolin answered Oct 18 '25 10:10



The solution I found is to hand Jackson a Writer that uses the encoding you want and to print the XML declaration yourself.

You have to use a Writer wrapper because Jackson inspects (via reflection, I think) what kind of writer you passed and what its encoding is; if that encoding is not UTF-8, it escapes characters above 127 as XML entities. If you are happy with XML entity encoding, you can skip the wrapper.

If you use Jackson's

mapper.configure(ToXmlGenerator.Feature.WRITE_XML_DECLARATION, true);

you run the risk of producing invalid XML, depending on your local environment. Jackson always prints UTF-8 in the XML declaration, so if you provide a stream with a non-UTF-8 encoding (some writer constructors don't let you specify an encoding and fall back to the platform default, which can differ between platforms), the body of the document may be encoded differently from what its XML declaration claims.
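
To make that mismatch concrete, here is a small JDK-only sketch (hypothetical class and method names, no Jackson involved) that builds a document labeled UTF-8 but encoded as ISO-8859-1, then decodes it the way a consumer trusting the declaration would:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class illustrating a declaration/body encoding mismatch.
final class DeclarationMismatchSketch {

    // The declaration claims UTF-8, but the bytes are produced as ISO-8859-1.
    static byte[] mislabeled(String text) {
        String xml = "<?xml version='1.0' encoding='UTF-8'?><name>" + text + "</name>";
        return xml.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Decodes the bytes with the charset named in the declaration.
    static String decodeAsDeclared(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String decoded = decodeAsDeclared(mislabeled("Bl\u00f6d"));
        // The single byte 0xF6 is not valid UTF-8, so the original character is lost.
        System.out.println(decoded.contains("\u00f6")); // false
    }
}
```

String decoding silently substitutes a replacement character here; a strict XML parser would more likely reject the document.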

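import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

// mapper is your XmlMapper and value is the object to serialize.
// Leave WRITE_XML_DECLARATION disabled, or the declaration is written twice.
String fileName = "/tmp/file.xml";
String encoding = "ISO-8859-1";
Writer output = new OutputStreamWriter(new FileOutputStream(fileName), encoding);

// Write the declaration by hand so it matches the writer's actual encoding.
output.write("<?xml version=\"1.0\" encoding=\"" + encoding + "\" ?>\n");
// Wrap the writer so that its concrete type and encoding are hidden from Jackson.
mapper.writer().writeValue(new Writer(output) {
    @Override
    public void write(char[] var1, int var2, int var3) throws IOException {
        output.write(var1, var2, var3);
    }

    @Override
    public void flush() throws IOException {
        output.flush();
    }

    @Override
    public void close() throws IOException {
        output.close();
    }

}, value);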
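
A shorter way to write the same pass-through wrapper is the JDK's java.io.FilterWriter, which already forwards write, flush and close to the wrapped writer. The sketch below uses hypothetical class and method names and only checks the bytes that reach the stream; that hiding the writer's type changes Jackson's escaping is the assumption described above, not something this code verifies:

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: hides the concrete writer type behind a generic Writer.
final class OpaqueWriterSketch {

    // FilterWriter delegates every call, so an empty anonymous subclass suffices.
    static Writer opaque(Writer delegate) {
        return new FilterWriter(delegate) { };
    }

    // Encodes text as ISO-8859-1 through the wrapper and returns the raw bytes.
    static byte[] encode(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = opaque(new OutputStreamWriter(out, StandardCharsets.ISO_8859_1))) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // One byte per character in ISO-8859-1.
        System.out.println(encode("Bl\u00f6d").length);
    }
}
```

With this helper, the anonymous Writer in the snippet above could be replaced by opaque(output).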
rattaman answered Oct 18 '25 12:10



