I've discovered this idiom recently, and I am wondering if there is something I am missing. I've never seen it used. Nearly all Java code I've worked with in the wild favors slurping data into a string or buffer, rather than something like this example (using HttpClient and XML APIs for example):
final LSOutput output; // XML stuff initialized elsewhere final LSSerializer serializer; final Document doc; // ... PostMethod post; // HttpClient post request final PipedOutputStream source = new PipedOutputStream(); PipedInputStream sink = new PipedInputStream(source); // ... executor.execute(new Runnable() { public void run() { output.setByteStream(source); serializer.write(doc, output); try { source.close(); } catch (IOException e) { throw new RuntimeException(e); } }}); post.setRequestEntity(new InputStreamRequestEntity(sink)); int status = httpClient.executeMethod(post);
That code uses a Unix-piping style technique to prevent multiple copies of the XML data being kept in memory. It uses the HTTP Post output stream and the DOM Load/Save API to serialize an XML Document as the content of the HTTP request. As far as I can tell it minimizes the use of memory with very little extra code (just the few lines for Runnable
, PipedInputStream
, and PipedOutputStream
).
So, what's wrong with this idiom? If there's nothing wrong with this idiom, why haven't I seen it?
EDIT: to clarify, PipedInputStream
and PipedOutputStream
replace the boilerplate buffer-by-buffer copy that shows up everywhere, and they also allow you to process incoming data concurrently with writing out the processed data. They don't use OS pipes.
Creates a piped output stream connected to the specified piped input stream. Data bytes written to this stream will then be available as input from snk .
The piped input stream contains a buffer, decoupling read operations from write operations, within limits. A pipe is said to be broken if a thread that was providing data bytes to the connected piped output stream is no longer alive.
From the Javadocs:
Typically, data is read from a PipedInputStream object by one thread and data is written to the corresponding PipedOutputStream by some other thread. Attempting to use both objects from a single thread is not recommended, as it may deadlock the thread.
This may partially explain why it is not more commonly used.
I'd assume another reason is that many developers do not understand its purpose / benefit.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With