
XMLStreamReader and a real stream

Update: I have not found a ready-made XML parser in the Java community that does XML parsing over NIO. The closest I found is Aalto, and it's incomplete: http://wiki.fasterxml.com/AaltoHome

I have the following code:

InputStream input = ...;
XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();

XMLStreamReader streamReader = xmlInputFactory.createXMLStreamReader(input, "UTF-8");

The question is, why does the method #createXMLStreamReader() expect to have an entire XML document in the input stream? Why is it called a "stream reader" if it can't seem to process a portion of XML data? For example, if I feed:

<root>
    <child>

to it, it tells me I'm missing the closing tags, even before I begin iterating the stream reader itself. I suspect that I just don't know how to use an XMLStreamReader properly. I should be able to supply it with data in pieces, right? I need this because I'm processing an XML stream coming in from a network socket, and I don't want to load the whole source text into memory.

Thank you for your help, Yuri.

Yuri Geinish asked Apr 16 '10 14:04



2 Answers

You can get what you want, a partial parse, but you must not close the stream when you reach the end of the currently available data. Keep the stream open, and the parser will simply block when it runs out of data to read. When more data arrives, write it to the stream, and the parser will continue.

This arrangement requires two threads: one running the parser and another fetching data. To bridge the two threads, use a pipe: a PipedInputStream and PipedOutputStream pair (both in java.io) that pushes data from the fetching thread into the input stream used by the parser. (The parser reads its data from the PipedInputStream.)
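A minimal sketch of the two-thread arrangement described above, using java.io.PipedInputStream and java.io.PipedOutputStream. The class name PipedXmlDemo, the method parseChunks, and the use of string chunks with a short sleep to stand in for socket reads are all assumptions made for illustration; a real producer would copy bytes from the socket instead.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class PipedXmlDemo {

    // Hypothetical helper: writes each chunk into a pipe from a producer
    // thread while the calling thread parses. Returns the names of the
    // start elements in the order the parser reports them.
    public static List<String> parseChunks(String... chunks) throws Exception {
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out);

        Thread producer = new Thread(() -> {
            // Closing the pipe only after the last chunk is what signals
            // end of document to the parser.
            try (PipedOutputStream o = out) {
                for (String chunk : chunks) {
                    o.write(chunk.getBytes(StandardCharsets.UTF_8));
                    o.flush();
                    Thread.sleep(20); // assumption: stands in for network latency
                }
            } catch (IOException e) {
                // The pipe is closed by try-with-resources; the parser will
                // then fail on the truncated document.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // Start the producer first: creating the reader may already
        // block while it waits for the first bytes.
        producer.start();

        List<String> names = new ArrayList<>();
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(in, "UTF-8");
        try {
            // next() blocks whenever the parser needs bytes that the
            // producer has not written yet.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
        } finally {
            reader.close();
            in.close();
        }
        producer.join();
        return names;
    }

    public static void main(String[] args) throws Exception {
        // The chunk boundaries deliberately fall in the middle of a tag.
        System.out.println(parseChunks("<root><chi", "ld>text</child>", "</root>"));
    }
}
```

The parser thread never sees a "missing closing tag" error here, because the pipe stays open until the producer has written the whole document. Memory use is bounded by the pipe's internal buffer rather than by the document size.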

mdma answered Oct 10 '22 13:10



If you absolutely need NIO with content "push", there are developers interested in completing the API for Aalto. The parser itself is a complete StAX implementation and also offers an alternative "push input" mode (feeding input to the parser instead of having it read from an InputStream). So you might instead want to check out the mailing lists if you are interested. Not everyone reads StackOverflow questions. :-)

StaxMan answered Oct 10 '22 13:10
