
How to generate CDATA block using JAXB?

Tags: java, xml, cdata, jaxb

I am using JAXB to serialize my data to XML. The classes are simple and are shown below. I want to produce XML that contains CDATA blocks for the values of some args. For example, the current code produces this XML:

<command>
   <args>
      <arg name="test_id">1234</arg>
      <arg name="source">&lt;html>EMAIL&lt;/html></arg>
   </args>
</command>

I want to wrap the "source" arg in CDATA such that it looks like below:

<command>
   <args>
      <arg name="test_id">1234</arg>
      <arg name="source"><![CDATA[<html>EMAIL</html>]]></arg>
   </args>
</command>

How can I achieve this with the code below?

@XmlRootElement(name="command")
public class Command {

    @XmlElementWrapper(name="args")
    protected List<Arg> arg;
}

@XmlRootElement(name="arg")
public class Arg {

    @XmlAttribute
    public String name;

    @XmlValue
    public String value;

    public Arg() {}

    static Arg make(final String name, final String value) {
        Arg a = new Arg();
        a.name = name;
        a.value = value;
        return a;
    }
}
Shreerang asked Jun 28 '10 21:06


4 Answers

Note: I'm the EclipseLink JAXB (MOXy) lead and a member of the JAXB (JSR-222) expert group.

If you are using MOXy as your JAXB provider then you can leverage the @XmlCDATA extension:

package blog.cdata;

import javax.xml.bind.annotation.XmlRootElement;
import org.eclipse.persistence.oxm.annotations.XmlCDATA;

@XmlRootElement(name="c")
public class Customer {

   private String bio;

   @XmlCDATA
   public void setBio(String bio) {
      this.bio = bio;
   }

   public String getBio() {
      return bio;
   }

}

For More Information

  • http://bdoughan.blogspot.com/2010/07/cdata-cdata-run-run-data-run.html
  • http://blog.bdoughan.com/2011/05/specifying-eclipselink-moxy-as-your.html
bdoughan answered Oct 13 '22 00:10


Use JAXB's Marshaller#marshal(Object, ContentHandler) to marshal into a ContentHandler object. Simply override the characters method of the ContentHandler implementation you are using (e.g. JDOM's SAXHandler, Apache's XMLSerializer, etc.):

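public class CDataContentHandler extends (SAXHandler|XMLSerializer|Other...) {
    // Characters that would otherwise have to be escaped; see http://www.w3.org/TR/xml/#syntax
    private static final Pattern XML_CHARS = Pattern.compile("[<>&]");

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // Wrap the text in a CDATA section only if it contains markup characters.
        // startCDATA()/endCDATA() are LexicalHandler methods, so the chosen base class must implement them.
        boolean useCData = XML_CHARS.matcher(new String(ch, start, length)).find();
        if (useCData) super.startCDATA();
        super.characters(ch, start, length);
        if (useCData) super.endCDATA();
    }
}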

This is much better than using the XMLSerializer.setCDataElements(...) method because you don't have to hardcode a list of elements. It emits a CDATA block only when the text contains characters that would otherwise have to be escaped.
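The handler above leaves its base class open. As a rough sketch of one way to fill that gap with JDK classes only, the following wraps the identity TransformerHandler from javax.xml.transform (which implements both ContentHandler and LexicalHandler) in an XMLFilterImpl. The class name CDataFilter and the factory method writingTo are made up for this illustration and are not part of the answer; the main method drives the handler by hand instead of through a JAXB Marshaller, so it assumes nothing about the JAXB provider.

```java
import java.io.StringWriter;
import java.io.Writer;
import java.util.regex.Pattern;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.SAXException;
import org.xml.sax.ext.LexicalHandler;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical name: forwards SAX events and brackets markup-like text with CDATA events. */
final class CDataFilter extends XMLFilterImpl {
    private static final Pattern XML_CHARS = Pattern.compile("[<>&]");
    private final LexicalHandler lexical;

    /** The target receives the content events and, as a LexicalHandler, the CDATA events. */
    CDataFilter(TransformerHandler target) {
        setContentHandler(target);
        this.lexical = target;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        boolean useCData = XML_CHARS.matcher(new String(ch, start, length)).find();
        if (useCData) lexical.startCDATA();
        super.characters(ch, start, length); // XMLFilterImpl forwards to the target
        if (useCData) lexical.endCDATA();
    }

    /** Hypothetical factory: serializes to the writer through the JDK identity transformer. */
    static CDataFilter writingTo(Writer out) throws Exception {
        SAXTransformerFactory stf = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler th = stf.newTransformerHandler();
        th.getTransformer().setOutputProperty(OutputKeys.METHOD, "xml");
        th.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        th.setResult(new StreamResult(out));
        return new CDataFilter(th);
    }

    public static void main(String[] args) throws Exception {
        StringWriter out = new StringWriter();
        CDataFilter handler = writingTo(out);
        // With JAXB this would be: marshaller.marshal(command, handler);
        char[] text = "<html>EMAIL</html>".toCharArray();
        handler.startDocument();
        handler.startElement("", "arg", "arg", new AttributesImpl());
        handler.characters(text, 0, text.length);
        handler.endElement("", "arg", "arg");
        handler.endDocument();
        System.out.println(out);
    }
}
```

Whether a given JAXB provider delivers each element's text in a single characters call is not guaranteed by the SAX contract, so text split across several calls may end up in several adjacent CDATA sections.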

a2ndrade answered Oct 13 '22 01:10


Solution Review:

  • fred's answer is only a workaround: it fails validation when the Marshaller is linked to a Schema, because it modifies the string literal instead of creating a real CDATA section. If you merely rewrite the String from foo to <![CDATA[foo]]>, Xerces sees a string of length 15 instead of 3.
  • The MOXy solution is implementation-specific and does not work with the JDK classes alone.
  • The getSerializer solution relies on the deprecated XMLSerializer class.
  • The LSSerializer solution is just a pain.

I modified a2ndrade's solution to use an XMLStreamWriter implementation instead. This solution works very well.

XMLOutputFactory xof = XMLOutputFactory.newInstance();
XMLStreamWriter streamWriter = xof.createXMLStreamWriter( System.out );
CDataXMLStreamWriter cdataStreamWriter = new CDataXMLStreamWriter( streamWriter );
marshaller.marshal( jaxbElement, cdataStreamWriter );
cdataStreamWriter.flush();
cdataStreamWriter.close();

This is the CDataXMLStreamWriter implementation. Its base class, DelegatingXMLStreamWriter, simply delegates all method calls to the given XMLStreamWriter implementation.

import java.util.regex.Pattern;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/**
 * Implementation which is able to decide to use a CDATA section for a string.
 */
public class CDataXMLStreamWriter extends DelegatingXMLStreamWriter
{
   private static final Pattern XML_CHARS = Pattern.compile( "[&<>]" );

   public CDataXMLStreamWriter( XMLStreamWriter del )
   {
      super( del );
   }

   @Override
   public void writeCharacters( String text ) throws XMLStreamException
   {
      boolean useCData = XML_CHARS.matcher( text ).find();
      if( useCData )
      {
         super.writeCData( text );
      }
      else
      {
         super.writeCharacters( text );
      }
   }
}
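The DelegatingXMLStreamWriter base class is not shown in the answer. As a minimal sketch of the same idea without that class, the delegation can be done with a java.lang.reflect.Proxy. The names CDataWriters, wrap and writeText are hypothetical and exist only for this illustration. Two details go beyond the original: the sketch also intercepts the writeCharacters(char[], int, int) overload, and it splits text containing "]]>" across two CDATA sections, since that sequence cannot appear inside a single one.

```java
import java.io.StringWriter;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.regex.Pattern;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical helper, not part of the original answer. */
final class CDataWriters {
    private static final Pattern XML_CHARS = Pattern.compile("[&<>]");

    private CDataWriters() {}

    /** Returns a proxy that forwards every call to the delegate, rerouting text writes. */
    static XMLStreamWriter wrap(XMLStreamWriter delegate) {
        return (XMLStreamWriter) Proxy.newProxyInstance(
                CDataWriters.class.getClassLoader(),
                new Class<?>[] {XMLStreamWriter.class},
                (proxy, method, args) -> {
                    if (method.getName().equals("writeCharacters")) {
                        // Covers writeCharacters(String) and writeCharacters(char[], int, int).
                        String text = args.length == 1
                                ? (String) args[0]
                                : new String((char[]) args[0], (Integer) args[1], (Integer) args[2]);
                        writeText(delegate, text);
                        return null;
                    }
                    try {
                        return method.invoke(delegate, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // surface the delegate's own exception
                    }
                });
    }

    private static void writeText(XMLStreamWriter out, String text) throws XMLStreamException {
        if (text == null || !XML_CHARS.matcher(text).find()) {
            out.writeCharacters(text);
            return;
        }
        // Split each "]]>" between "]]" and ">" so no single section contains it.
        int from = 0;
        for (int end = text.indexOf("]]>"); end >= 0; end = text.indexOf("]]>", from)) {
            out.writeCData(text.substring(from, end + 2));
            from = end + 2;
        }
        out.writeCData(text.substring(from));
    }

    public static void main(String[] args) throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = wrap(XMLOutputFactory.newInstance().createXMLStreamWriter(out));
        // With JAXB this would be: marshaller.marshal(jaxbElement, writer);
        writer.writeStartElement("arg");
        writer.writeCharacters("<html>EMAIL</html>");
        writer.writeEndElement();
        writer.flush();
        writer.close();
        System.out.println(out);
    }
}
```

The proxy trades compile-time checking for brevity. An explicit delegating class, as in the answer, is easier to debug and makes the overridden methods visible in the source.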
Michael Ernst answered Oct 13 '22 00:10


Here is the code sample referenced by the site mentioned above:

import java.io.File;
import java.io.StringWriter;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.apache.xml.serialize.OutputFormat;
import org.apache.xml.serialize.XMLSerializer;
import org.w3c.dom.Document;

public class JaxbCDATASample {

    public static void main(String[] args) throws Exception {
        // unmarshal a doc
        JAXBContext jc = JAXBContext.newInstance("...");
        Unmarshaller u = jc.createUnmarshaller();
        Object o = u.unmarshal(...);

        // create a JAXB marshaller
        Marshaller m = jc.createMarshaller();

        // get an Apache XMLSerializer configured to generate CDATA
        XMLSerializer serializer = getXMLSerializer();

        // marshal using the Apache XMLSerializer
        m.marshal(o, serializer.asContentHandler());
    }

    private static XMLSerializer getXMLSerializer() {
        // configure an OutputFormat to handle CDATA
        OutputFormat of = new OutputFormat();

        // specify which of your elements you want to be handled as CDATA.
        // The use of the '^' between the namespaceURI and the localname
        // seems to be an implementation detail of the xerces code.
        // When processing xml that doesn't use namespaces, simply omit the
        // namespace prefix as shown in the third CDataElement below.
        of.setCDataElements(
            new String[] { "ns1^foo",   // <ns1:foo>
                           "ns2^bar",   // <ns2:bar>
                           "^baz" });   // <baz>

        // set any other options you'd like
        of.setPreserveSpace(true);
        of.setIndenting(true);

        // create the serializer
        XMLSerializer serializer = new XMLSerializer(of);
        serializer.setOutputByteStream(System.out);

        return serializer;
    }
}
ra9r answered Oct 12 '22 23:10