INFO - Sample code
I've set up sample code (SSCCE) for you to help track the problem:
https://github.com/ljader/test-cxf-base64-marshall
The problem
I'm integrating with a 3rd party JAX-WS service, so I cannot change the WSDL.
The 3rd party web service expects Base64-encoded bytes to perform some operation on them, and it expects the client to send all of the bytes inside the SOAP message. They don't want to switch to MTOM / XOP, so I'm stuck with the current requirements.
I decided to use CXF to easily set up a sample client, and it worked OK for small files.
But when I try to send BIG data, e.g. 200 MB, CXF/JAXB throws an exception:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at com.sun.xml.bind.v2.util.ByteArrayOutputStreamEx.readFrom(ByteArrayOutputStreamEx.java:75)
at com.sun.xml.bind.v2.runtime.unmarshaller.Base64Data.get(Base64Data.java:196)
at com.sun.xml.bind.v2.runtime.unmarshaller.Base64Data.writeTo(Base64Data.java:312)
at com.sun.xml.bind.v2.runtime.output.UTF8XmlOutput.text(UTF8XmlOutput.java:312)
at com.sun.xml.bind.v2.runtime.XMLSerializer.leafElement(XMLSerializer.java:356)
at com.sun.xml.bind.v2.model.impl.RuntimeBuiltinLeafInfoImpl$PcdataImpl.writeLeafElement(RuntimeBuiltinLeafInfoImpl.java:191)
at com.sun.xml.bind.v2.runtime.MimeTypedTransducer.writeLeafElement(MimeTypedTransducer.java:96)
at com.sun.xml.bind.v2.runtime.reflect.TransducedAccessor$CompositeTransducedAccessorImpl.writeLeafElement(TransducedAccessor.java:254)
at com.sun.xml.bind.v2.runtime.property.SingleElementLeafProperty.serializeBody(SingleElementLeafProperty.java:130)
at com.sun.xml.bind.v2.runtime.ClassBeanInfoImpl.serializeBody(ClassBeanInfoImpl.java:360)
at com.sun.xml.bind.v2.runtime.XMLSerializer.childAsXsiType(XMLSerializer.java:696)
at com.sun.xml.bind.v2.runtime.ElementBeanInfoImpl$1.serializeBody(ElementBeanInfoImpl.java:155)
at com.sun.xml.bind.v2.runtime.ElementBeanInfoImpl$1.serializeBody(ElementBeanInfoImpl.java:130)
at com.sun.xml.bind.v2.runtime.ElementBeanInfoImpl.serializeBody(ElementBeanInfoImpl.java:332)
at com.sun.xml.bind.v2.runtime.ElementBeanInfoImpl.serializeRoot(ElementBeanInfoImpl.java:339)
at com.sun.xml.bind.v2.runtime.ElementBeanInfoImpl.serializeRoot(ElementBeanInfoImpl.java:75)
at com.sun.xml.bind.v2.runtime.XMLSerializer.childAsRoot(XMLSerializer.java:494)
at com.sun.xml.bind.v2.runtime.MarshallerImpl.write(MarshallerImpl.java:323)
at com.sun.xml.bind.v2.runtime.MarshallerImpl.marshal(MarshallerImpl.java:251)
at javax.xml.bind.helpers.AbstractMarshallerImpl.marshal(AbstractMarshallerImpl.java:95)
at org.apache.cxf.jaxb.JAXBEncoderDecoder.writeObject(JAXBEncoderDecoder.java:617)
at org.apache.cxf.jaxb.JAXBEncoderDecoder.marshall(JAXBEncoderDecoder.java:241)
at org.apache.cxf.jaxb.io.DataWriterImpl.write(DataWriterImpl.java:237)
at org.apache.cxf.interceptor.AbstractOutDatabindingInterceptor.writeParts(AbstractOutDatabindingInterceptor.java:117)
at org.apache.cxf.wsdl.interceptors.BareOutInterceptor.handleMessage(BareOutInterceptor.java:68)
at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
at org.apache.cxf.endpoint.ClientImpl.doInvoke(ClientImpl.java:514)
at org.apache.cxf.endpoint.ClientImpl.invoke(ClientImpl.java:423)
at org.apache.cxf.endpoint.ClientImpl.invoke(ClientImpl.java:324)
at org.apache.cxf.endpoint.ClientImpl.invoke(ClientImpl.java:277)
at org.apache.cxf.frontend.ClientProxy.invokeSync(ClientProxy.java:96)
at org.apache.cxf.jaxws.JaxWsClientProxy.invoke(JaxWsClientProxy.java:139)
My findings
I've tracked the problem down: based on the XSD type "base64Binary", com.sun.xml.bind.v2.model.impl.RuntimeBuiltinLeafInfoImpl decides that com.sun.xml.bind.v2.runtime.unmarshaller.Base64Data should handle marshalling of data from javax.activation.DataHandler.
During marshalling, the WHOLE content of the underlying InputStream is read into memory (http://grepcode.com/file/repo1.maven.org/maven2/com.sun.xml.bind/jaxb-impl/2.2.11/com/sun/xml/bind/v2/runtime/unmarshaller/Base64Data.java/#311), which causes the OutOfMemoryError.
Problem
CXF uses JAXB to marshal Java objects into SOAP messages. When marshalling an InputStream, the WHOLE input stream is read into memory before being converted into Base64 binary.
So I want to send ("stream") data from client to server in chunks (since the OutputStream in the marshaller wraps the HttpURLConnection directly), so that my client can handle sending any amount of data.
Especially when many threads are using my client, streaming is IMHO very desirable.
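To make "streaming in chunks" concrete, here is a minimal sketch of Base64-encoding an InputStream into an OutputStream with a fixed-size buffer. It assumes Java 8+ (java.util.Base64) and is not a CXF or JAXB API; the class and method names are made up for illustration, and how to plug something like this into the marshaller is exactly the open question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, for illustration only (not part of CXF or JAXB).
class StreamingBase64Sketch {

    // Copies `in` to `out` as Base64 text through an 8 KB buffer, so memory
    // use stays constant regardless of the input size.
    // Returns the number of raw (unencoded) bytes read.
    static long encode(InputStream in, OutputStream out) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        // Closing the wrapper writes the final padded group. Note that it
        // also closes `out`; wrap `out` in a non-closing stream if it must
        // stay open (e.g. when more XML follows the Base64 text).
        try (OutputStream base64 = Base64.getEncoder().wrap(out)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                base64.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "hello world".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long read = encode(new ByteArrayInputStream(payload), sink);
        System.out.println(read + " bytes in, Base64 out: " + sink.toString("US-ASCII"));
    }
}
```

In a real client the sink would be the connection's output stream rather than a ByteArrayOutputStream, which is used here only to keep the sketch self-contained.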
I don't have good JAX-WS/CXF/JAXB knowledge, hence the question.
The only materials I found that may be useful are:
Can JAXB parse large XML files in chunks
http://rezarahim.blogspot.com/2010/05/chunking-out-big-xml-with-stax-and-jaxb.html
The questions
Why does CXF/JAXB load the whole InputStream into memory? Isn't the purpose of DataHandler to prevent exactly that?
Do you know any way to change JAXB's behaviour so that it marshals an InputStream differently?
Do you know of other marshallers that can handle marshalling such big data?
As a last resort, do you have links to any materials on how to create a custom marshaller that would stream the data directly to the server?
You don't need any custom marshallers or change JAXB behaviour to achieve what you need - DataHandler is your friend here.
Answering your first question: JAXB needs to keep all data in memory because it has to resolve references.
I know you can't change the WSDL references, etc. But you still have the client's WSDL in your project in order to generate the client classes, don't you? So what you can do (I haven't tested this with a third party's WSDL, but it might be worth trying) is add xmime:expectedContentTypes="application/octet-stream" to the response XSD element that returns the Base64-encoded data. For example:
<xsd:element name="generateBigDataResponse">
    <xsd:complexType>
        <xsd:sequence>
            <xsd:element name="result"
                         type="xsd:base64Binary"
                         minOccurs="0"
                         maxOccurs="1"
                         xmime:expectedContentTypes="application/octet-stream"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:element>
Also, do not forget to add the namespace xmlns:xmime="http://www.w3.org/2005/05/xmlmime" to the xsd:schema element.
What you are doing here is not changing any WSDL references; you are just telling JAXB to generate DataHandler instead of byte[]. So this is what happens when you generate your client classes like that:
@Override
public DataHandler generateBigData() {
    try {
        final PipedOutputStream pipedOutputStream = new PipedOutputStream();
        PipedInputStream pipedInputStream = new PipedInputStream(pipedOutputStream);
        InputStreamDataSource dataSource = new InputStreamDataSource(pipedInputStream, "application/octet-stream");
        // Write on a separate thread: using both ends of a pipe from one thread may deadlock.
        executor.execute(new Runnable() {
            @Override
            public void run() {
                // write your data into pipedOutputStream here, then close it
            }
        });
        return new DataHandler(dataSource);
    } catch (IOException e) {
        // Connecting the pipe failed; rethrow so the method never ends without returning a value.
        throw new IllegalStateException("Could not set up the piped streams", e);
    }
}
You get DataHandler as the response type thanks to xmime. I suggest you use PipedOutputStream, but make sure to do the writing in a different thread:
A piped output stream can be connected to a piped input stream to create a communications pipe. The piped output stream is the sending end of the pipe. Typically, data is written to a PipedOutputStream object by one thread and data is read from the connected PipedInputStream by some other thread. Attempting to use both objects from a single thread is not recommended as it may deadlock the thread. The pipe is said to be broken if a thread that was reading data bytes from the connected piped input stream is no longer alive.
You then connect it to the PipedInputStream, whose instance goes into the constructor of InputStreamDataSource, which you then pass into DataHandler, and you return the DataHandler instance. This way your file will be written in chunks and you won't get that exception; moreover, the client will never get a timeout.
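The pipe mechanics can be tried in isolation with plain JDK classes. The following is a minimal sketch, with made-up class and method names and dummy zero-filled data: one thread writes into the PipedOutputStream in chunks while the caller reads from the connected PipedInputStream. DataHandler and InputStreamDataSource are left out on purpose so the sketch runs without extra libraries.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class, for illustration only.
class PipedStreamSketch {

    // Returns the reading end of a pipe. A task on `executor` writes
    // `totalBytes` zero bytes into the writing end, 8 KB at a time.
    static InputStream produce(final long totalBytes, ExecutorService executor) throws IOException {
        final PipedOutputStream sink = new PipedOutputStream();
        PipedInputStream source = new PipedInputStream(sink, 64 * 1024);
        executor.execute(new Runnable() {
            @Override
            public void run() {
                byte[] chunk = new byte[8192];
                long remaining = totalBytes;
                // Closing the writing end tells the reader that the data is complete.
                try (PipedOutputStream out = sink) {
                    while (remaining > 0) {
                        int n = (int) Math.min(chunk.length, remaining);
                        out.write(chunk, 0, n); // blocks while the pipe buffer is full
                        remaining -= n;
                    }
                } catch (IOException e) {
                    // The reader closed its end or died; stop writing.
                }
            }
        });
        return source;
    }

    // Reads the stream to its end and returns the number of bytes seen.
    static long drain(InputStream in) throws IOException {
        long count = 0;
        byte[] buffer = new byte[8192];
        try (InputStream src = in) {
            int n;
            while ((n = src.read(buffer)) != -1) {
                count += n;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            System.out.println("bytes received: " + drain(produce(1_000_000L, executor)));
        } finally {
            executor.shutdown();
        }
    }
}
```

Only the 64 KB pipe buffer and the two 8 KB arrays are held in memory at any time, however large totalBytes is.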
Hope this helps.