Reading json stream without blocking

I want to read a stream of JSON messages from a socket using Jackson 2.

There are ways to pass a Reader as the source, such as doing:

ObjectMapper mapper = new ObjectMapper();
MyObject obj = mapper.readValue(aReader, MyObject.class);

but that blocks until the entire JSON message has arrived, and I want to avoid that.

Is there a way to keep a buffer that I can append bytes to as they arrive, and then ask whether it contains a complete JSON representation of a specific class?
Something like:

JsonBuffer buffer = new JsonBuffer(MyObject.class);
...
buffer.add(readBytes);
if (buffer.hasObject()) {
    MyObject obj = buffer.readObject();
}

Thanks.

Nitzan Tomer asked Jun 02 '12 11:06


3 Answers

Jackson supports non-blocking JSON stream parsing as of 2.9. You can find an example of how to use it in Spring Framework 5's Jackson2Tokenizer.

Sébastien Deleuze answered Sep 21 '22 10:09


(I know this thread is old, but since there is no accepted answer, I wanted to add mine, just in case anyone still reads this).

I just published a new library called Actson (https://github.com/michel-kraemer/actson). It works almost like the OP suggested. You can feed it with bytes until it returns one or more JSON events. When it has consumed all input data, you feed it with more bytes and get the next JSON events. This process continues until the JSON text has been fully consumed.

If you know Aalto XML (https://github.com/FasterXML/aalto-xml) then you should be able to familiarise yourself with Actson quickly because the interface is almost the same.

Here's a quick example:

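import java.nio.charset.StandardCharsets;
import de.undercouch.actson.JsonEvent;
import de.undercouch.actson.JsonParser;

// JSON text to parse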
byte[] json = "{\"name\":\"Elvis\"}".getBytes(StandardCharsets.UTF_8);

JsonParser parser = new JsonParser(StandardCharsets.UTF_8);

int pos = 0; // position in the input JSON text
int event; // event returned by the parser
do {
    // feed the parser until it returns a new event
    while ((event = parser.nextEvent()) == JsonEvent.NEED_MORE_INPUT) {
        // provide the parser with more input
        pos += parser.getFeeder().feed(json, pos, json.length - pos);

        // indicate end of input to the parser
        if (pos == json.length) {
            parser.getFeeder().done();
        }
    }

    // handle event
    System.out.println("JSON event: " + event);
    if (event == JsonEvent.ERROR) {
        throw new IllegalStateException("Syntax error in JSON text");
    }
} while (event != JsonEvent.EOF);
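
For comparison, the buffer API sketched in the question can be approximated with the JDK alone if all you need to know is when a complete message has arrived. The sketch below is only an illustration and is not part of Actson or Jackson: the class name JsonFrameBuffer and its methods are hypothetical, it assumes UTF-8 input whose top-level values are objects or arrays, and it does not validate the JSON. It tracks bracket depth and string/escape state, and queues the text of each complete message.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Queue;

/**
 * Hypothetical helper (not a Jackson or Actson class): splits incoming
 * UTF-8 bytes into complete top-level JSON objects or arrays.
 * It only finds message boundaries; it does not validate the JSON.
 */
final class JsonFrameBuffer {
    private final ByteArrayOutputStream current = new ByteArrayOutputStream();
    private final Queue<String> complete = new ArrayDeque<>();
    private int depth = 0;            // nesting level of {} and []
    private boolean inString = false; // inside a JSON string literal
    private boolean escaped = false;  // previous byte was a backslash in a string

    /** Adds all bytes of the given array; never blocks. */
    void add(byte[] bytes) {
        add(bytes, 0, bytes.length);
    }

    /** Adds the bytes just read from the socket; never blocks. */
    void add(byte[] bytes, int offset, int length) {
        for (int i = offset; i < offset + length; i++) {
            byte b = bytes[i];
            if (depth == 0 && b != '{' && b != '[') {
                continue; // skip whitespace or separators between messages
            }
            current.write(b);
            if (inString) {
                if (escaped) {
                    escaped = false;
                } else if (b == '\\') {
                    escaped = true;
                } else if (b == '"') {
                    inString = false;
                }
            } else if (b == '"') {
                inString = true;
            } else if (b == '{' || b == '[') {
                depth++;
            } else if (b == '}' || b == ']') {
                if (--depth == 0) {
                    // a top-level value just closed: queue its text
                    complete.add(new String(current.toByteArray(), StandardCharsets.UTF_8));
                    current.reset();
                }
            }
        }
    }

    /** True if at least one complete message is waiting. */
    boolean hasObject() {
        return !complete.isEmpty();
    }

    /** Returns the text of the next complete message, or null if there is none. */
    String readObject() {
        return complete.poll();
    }
}
```

Each string returned by readObject() can then be handed to a regular parser, for example mapper.readValue(text, MyObject.class), which no longer has to wait on the socket. Scanning bytes rather than characters is safe here because the bytes of a multi-byte UTF-8 sequence never collide with ASCII quotes, backslashes or brackets. The trade-off compared with a real non-blocking parser is that every message is scanned twice and malformed input is only detected by the second parser.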

Michel Krämer answered Sep 21 '22 10:09


You can use JsonParser to get individual events/tokens (this is what ObjectMapper uses internally), which gives you more granular access. But all current functionality uses blocking I/O, so there is no way to do so-called non-blocking (aka "async") parsing.

EDIT: 2019-09-18 -- correction: Jackson 2.9 (https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.9) added support for non-blocking/async JSON parsing (issue https://github.com/FasterXML/jackson-core/issues/57)

StaxMan answered Sep 21 '22 10:09