Handling empty XML nodes when converting to JSON using Jackson

I read in an XML file (provided by another system, so I cannot control its format) and convert it to JSON using Jackson. I am seeing undesirable behavior: any "empty" node in the source XML is converted to JSON with "\n" plus the indentation spaces of the source file as its content. For example:

Generated output:

{"a":"Dummy Content","b":"\n "}

Desired output:

{"a":"Dummy Content","b":""}

What is the best way to correct this generically, so that it works for any XML file containing any number of empty nodes?

When loading the file, I tried cleaning it up line by line like this:

String content = "";
try (BufferedReader br = new BufferedReader(new FileReader("MyFile.xml"))) {
    String line;
    while ((line = br.readLine()) != null) {
        content += line.replace(System.getProperty("line.separator"), "").trim();
    }
}

It appears to work, but I was wondering whether there is a better solution. The source XML files can be quite large (hundreds of thousands of lines).

Sample code that illustrates the issue

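import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.dataformat.xml.XmlMapper;
import java.io.IOException;

// <b> contains only a line break and indentation, i.e. it is an "empty" node
private static String testXML
    = "<Root>\n"
    + " <a>Dummy Content</a>\n"
    + " <b>\n"
    + " </b>\n"
    + "</Root>";

public static void main(String[] args) {
    XmlMapper xmlMapper = new XmlMapper();
    JsonNode jsonNode = null;
    try {
        // Parse the XML into Jackson's generic tree model
        jsonNode = xmlMapper.readTree(testXML);
    } catch (IOException ex) {
        System.out.println(ex);
    }
    // Printing a JsonNode renders it as JSON
    System.out.println(jsonNode);
}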

user3689706 asked Mar 06 '26 00:03


2 Answers

If you deserialise the XML into a JsonNode, you can supply your own JsonNodeFactory, the class Jackson uses to create the tree nodes that hold the data. For String values, override the textNode method and, when the value is blank, replace it with an empty String.

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.JsonNodeFactory;
import com.fasterxml.jackson.databind.node.TextNode;
import com.fasterxml.jackson.dataformat.xml.XmlMapper;
import org.apache.commons.lang3.StringUtils;

public class XmlApp {

  public static void main(String[] args) throws Exception {
    String testXML = "<Root>\n <a>Dummy Content</a>\n <b>\n </b>\n</Root>";

    XmlMapper xmlMapper = new XmlMapper();
    xmlMapper.setNodeFactory(new TrimStringTextJsonNodeFactory());

    JsonNode jsonNode = xmlMapper.readTree(testXML);

    System.out.println(jsonNode);
  }
}

class TrimStringTextJsonNodeFactory extends JsonNodeFactory {

  @Override
  public TextNode textNode(String text) {
    if (StringUtils.isBlank(text)) {
      text = StringUtils.trimToEmpty(text);
    }
    return super.textNode(text);
  }
}

The code above prints:

{"a":"Dummy Content","b":""}
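If adding Apache Commons Lang only for this one check is undesirable, the same blank-to-empty rule can probably be written with the JDK alone. The following is a minimal sketch, not part of the original answer: `BlankText` and `normalize` are hypothetical names, and the idea is that the overridden `textNode` would call `super.textNode(BlankText.normalize(text))`. It treats `null`, empty and whitespace-only input as blank (checked with `Character.isWhitespace`) and leaves every other value untouched.

```java
final class BlankText {

    private BlankText() {
    }

    // Hypothetical helper: returns "" for null, empty or whitespace-only text.
    // Any text containing at least one non-whitespace character is returned unchanged,
    // including its leading and trailing whitespace.
    static String normalize(String text) {
        if (text == null || text.chars().allMatch(Character::isWhitespace)) {
            return "";
        }
        return text;
    }

    public static void main(String[] args) {
        // Whitespace-only content, as produced by an indented empty XML node
        System.out.println("[" + normalize("\n   ") + "]");
        // Real content is not trimmed
        System.out.println("[" + normalize(" Dummy Content ") + "]");
    }
}
```

Because only blank values are rewritten, text that merely starts or ends with whitespace keeps its original form.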
Michał Ziober answered Mar 07 '26 14:03


You could strip the newline characters from the XML string before parsing it:

testXML = testXML.replace("\n", "");
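Note that removing only the line breaks leaves the indentation spaces inside an otherwise empty node, so `<b>` would still contain a space. A variant of the same idea is to drop any whitespace-only run that sits between two tags. The sketch below is an illustration under assumptions, not part of the original answer: `XmlWhitespace` and `stripBetweenTags` are hypothetical names, and the regex assumes the document has no CDATA sections, comments or mixed content in which whitespace between `>` and `<` is meaningful.

```java
final class XmlWhitespace {

    private XmlWhitespace() {
    }

    // Removes runs of whitespace that sit directly between a '>' and the next '<',
    // i.e. text made up of nothing but line breaks and indentation.
    // Text that contains any other character is left as it is.
    static String stripBetweenTags(String xml) {
        return xml.replaceAll(">\\s+<", "><");
    }

    public static void main(String[] args) {
        String testXML = "<Root>\n <a>Dummy Content</a>\n <b>\n </b>\n</Root>";
        System.out.println(stripBetweenTags(testXML));
    }
}
```

After this step `<b>` is a genuinely empty element by the time the parser sees it. The whole document is still held in memory as one String, so for very large files this is no lighter than the line-by-line cleanup in the question.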
aholake answered Mar 07 '26 14:03


