
how to remove empty tags in input xml

Tags: java, xml, jaxb

My Java module gets a huge input XML from a mainframe. Unfortunately, the mainframe is unable to skip optional elements, so I get a LOT of empty tags in my input:

So,

<SSN>111111111</SSN>
<Employment>
<Current>
<Address>
<line1/>
<line2/>
<line3/>
<city/>
<state/>
<country/>
</Address>
<Phone>
<phonenumber/>
<countryCode/>
</Phone>
</Current>
<Previous>
<Address>
<line1/>
<line2/>
<line3/>
<city/>
<state/>
<country/>    
</Address>
<Phone>
<phonenumber/>
<countryCode/>
</Phone>
</Previous>
</Employment>
<MaritalStatus>Single</MaritalStatus>

should be:

<SSN>111111111</SSN>
<MaritalStatus>Single</MaritalStatus>

I use JAXB to unmarshal the input XML string that the mainframe sends. Is there a clean, easy way to remove all the empty group tags, or do I have to do this manually in the code for each element? I have over 350 elements in my input XML, so I would love it if JAXB itself had a way of doing this automatically.

Thanks, SGB

SGB asked May 21 '10 17:05


2 Answers

You could preprocess using XSLT. I know it's considered a bit "Disco" nowadays, but it is fast and easy to apply.

From this tek-tips discussion, you could transform with XSLT to remove empty elements.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:if test="normalize-space(.) != '' or ./@* != ''">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
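
To run a stylesheet like this from Java before handing the text to JAXB, something along these lines might work. It is a minimal sketch using the JDK's built-in javax.xml.transform API; the class name EmptyElementFilter and the method strip are made up for illustration. The embedded stylesheet tests normalize-space(.) so that elements containing only indentation count as empty, which is an assumption about what "empty" should mean for this feed. The input has to be a well-formed document with a single root element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper: drops empty elements from an XML string before JAXB sees it. */
final class EmptyElementFilter {

    // Identity-style stylesheet: a node is copied only if it has non-blank
    // content or a non-empty attribute. Using normalize-space() is an
    // assumption; it treats whitespace-only (indented) elements as empty.
    private static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:if test=\"normalize-space(.) != '' or ./@* != ''\">"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:if>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String strip(String xml) {
        try {
            // Compile the stylesheet and run the input through it.
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers do not have to declare a checked exception.
            throw new IllegalStateException("XSLT preprocessing failed", e);
        }
    }
}
```

The cleaned string can then be passed to the existing unmarshalling code, for example through Unmarshaller.unmarshal(new StringReader(cleaned)). Compiling the stylesheet on every call keeps the sketch short; for a high-volume feed it would probably be worth caching a javax.xml.transform.Templates instance instead.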
blissapp answered Oct 20 '22 22:10


I think you'd have to edit your mainframe code for the best solution. When your mainframe generates the XML, you'll have to tell it not to output a tag if it's empty.

I don't think there's much you can do on the client side. If the XML that you get is filled with empty tags, then you have no choice but to parse them all. After all, how can you tell if a tag is empty without parsing it in some way?

But maybe you could do a regex string replace on the XML text before JAXB gets to it:

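String xml = getXml(); // placeholder: obtain the XML text however your module does
// [^<>] keeps the match inside one tag; "<.*?/>" could swallow preceding tags on the same line
xml = xml.replaceAll("<[^<>]+/>", "");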

This will remove empty tags like "<city/>" but not "<Address></Address>".
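
To also get rid of the "<Address></Address>" leftovers, the replacement could be repeated until nothing changes, so that groups which become empty in one pass are removed in the next. The following is only a sketch under some assumptions: EmptyTagRegex and strip are hypothetical names, only attribute-less tags are treated as empty, and the input contains no comments or CDATA sections that happen to look like tags. Regex handling of XML is fragile in general, so this suits a predictable machine-generated feed better than arbitrary documents.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: repeatedly strips attribute-less empty tags with a regex. */
final class EmptyTagRegex {

    // First alternative: self-closing tag such as <city/>.
    // Second alternative: open/close pair with only whitespace between,
    // such as <Address></Address>; \2 forces the closing name to match.
    private static final Pattern EMPTY = Pattern.compile(
        "<([A-Za-z_][\\w.:-]*)\\s*/>"
      + "|<([A-Za-z_][\\w.:-]*)\\s*>\\s*</\\2\\s*>");

    static String strip(String xml) {
        String previous;
        do {
            previous = xml;
            // Each pass can empty out a parent, so loop until the text is stable.
            xml = EMPTY.matcher(xml).replaceAll("");
        } while (!xml.equals(previous));
        return xml;
    }
}
```

On the question's sample, the first pass removes the leaf tags, the following passes remove Address and Phone, then Current and Previous, then Employment, and the loop stops once a pass leaves the text unchanged.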

Michael answered Oct 20 '22 23:10