
How can I parse the first JSON object on a stream in JS?

I have a stream of JSON objects, as with JSON-RPC over TCP or WebSockets. There's no length prefix or delimiter, because JSON is self-delimiting. So, when I read from the stream, I may end up with something like this:

{"id":1,"result":{"answer":23},"error":null}
{"id":2,"result":{"answer":42},"error":null}
{"id":3,"result":{"answ

I need to parse each JSON object one by one. I can't do this with JSON.parse, because it will just throw a syntax error for extraneous data at the end.
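To make the failure concrete, here is a minimal sketch of what happens when two complete responses arrive in one read. The variable names `chunk` and `failure` are only illustrative.

```javascript
// Two complete JSON-RPC responses that happened to arrive in a single read.
const chunk = '{"id":1,"result":{"answer":23},"error":null}' +
              '{"id":2,"result":{"answer":42},"error":null}';

let failure = null;
try {
  JSON.parse(chunk); // rejects the input because of the second object
} catch (err) {
  failure = err;     // a SyntaxError; the message text varies by engine
}
console.log(failure instanceof SyntaxError);
```

The error carries no standard "index where parsing stopped" field, so it can't be used to split the buffer.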

Of course with that example I could go line by line, but I can't rely on the whitespace looking like that; JSON-RPC can just as easily look like this:

{
  "id": 1, 
  "result": {
    "answer": 23
  },
  "error":null
} 

Or this:

{"id":1,"result":{"answer":23},"error":null}{"id":2,"result":{"answer":42},"error":null}

With most parsers in other languages, the obvious answer is something like this (using Python as an example):

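import json

buf = ''
decoder = json.JSONDecoder()

def onReadReady(sock):
  global buf
  buf += sock.read()
  try:
    # raw_decode returns the parsed value and the index where it ended
    obj, index = decoder.raw_decode(buf)
  except ValueError:
    return  # incomplete (or invalid) JSON so far; wait for more data
  buf = buf[index:].lstrip()  # raw_decode rejects leading whitespace
  dispatch(obj)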

But I can't find anything similar in JS. I've looked at every JS parser I can find, and they're all effectively equivalent to JSON.parse.

I tried looking at various JSON-RPC frameworks to see how they handle this problem, and they just don't. Many of them assume that a recv will always return exactly one send (which works fine for JSON-RPC over HTTP, but not over TCP or WebSockets—although it may appear to work in local tests, of course). Others don't actually handle JSON-RPC because they add requirements on whitespace (some of which aren't even valid for JSON-RPC).

I could write a delimiter check that balances brackets and quotes (handling escaping and quoting, of course), or just write a JSON parser from scratch (or port one from another language, or modify http://code.google.com/p/json-sans-eval/), but I can't believe no one has done this before.

EDIT: I've made two versions myself, http://pastebin.com/fqjKYiLw based on json-sans-eval, and http://pastebin.com/8H4QT82b based on Crockford's reference recursive descent parser json_parse.js. I would still prefer to use something that's been tested and used by other people rather than coding it myself, so I'm leaving this question open.

abarnert asked Mar 22 '12 20:03



2 Answers

After a month of searching for alternatives and not finding anything useful, I decided to code up a bunch of different implementations and test them out. I went with my modification of Crockford's reference recursive-descent parser (the json_parse.js-based version linked in the question).

It wasn't the fastest, but it was more than fast enough in every test I did. More importantly, it catches clearly erroneous JSON (when that's not ambiguous with incomplete JSON) much better than most of the other alternatives. Most importantly, it required only a few simple changes to a well-known, well-tested codebase, which makes me more confident in its correctness.
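For readers who want to see the shape of the idea, here is a rough sketch of a recursive-descent parser that reports where the first value ends and separates "ran out of input" from "cannot be valid JSON". This is not the modified json_parse.js mentioned above. The names `parseFirst` and `IncompleteJSON` are made up for illustration, and handing string and number tokens to JSON.parse is a shortcut taken here to keep it short.

```javascript
// Hypothetical error type: the text ended before the value did.
class IncompleteJSON extends Error {}

// Parse one JSON value beginning at `start`. Returns { value, end }, where
// `end` is the index just past the value. Throws IncompleteJSON when more
// data is needed and SyntaxError when the text cannot be valid JSON.
function parseFirst(text, start = 0) {
  let i = start;
  const peek = () => {
    if (i >= text.length) throw new IncompleteJSON("need more data");
    return text[i];
  };
  const skipSpace = () => {
    while (i < text.length && " \t\r\n".includes(text[i])) i++;
  };
  const fail = () => {
    throw new SyntaxError("unexpected " + JSON.stringify(text[i]) + " at index " + i);
  };
  const expect = (ch) => {
    if (peek() !== ch) fail();
    i++;
  };

  function value() {
    skipSpace();
    const c = peek();
    if (c === "{") return object();
    if (c === "[") return array();
    if (c === '"') return string();
    return scalar();
  }
  function object() {
    const out = {};
    i++; // consume "{"
    skipSpace();
    if (peek() === "}") { i++; return out; }
    for (;;) {
      skipSpace();
      if (peek() !== '"') fail(); // keys must be strings
      const key = string();
      skipSpace();
      expect(":");
      out[key] = value();
      skipSpace();
      if (peek() === "}") { i++; return out; }
      expect(",");
    }
  }
  function array() {
    const out = [];
    i++; // consume "["
    skipSpace();
    if (peek() === "]") { i++; return out; }
    for (;;) {
      out.push(value());
      skipSpace();
      if (peek() === "]") { i++; return out; }
      expect(",");
    }
  }
  function string() {
    const from = i;
    i++; // consume the opening quote
    for (;;) {
      const c = peek();
      if (c === "\\") i += 2;            // skip the escaped character
      else if (c === '"') { i++; break; } // closing quote
      else i++;
    }
    // JSON.parse validates the escapes and decodes the string.
    return JSON.parse(text.slice(from, i));
  }
  function scalar() {
    const from = i;
    while (i < text.length && /[-+.0-9A-Za-z]/.test(text[i])) i++;
    // A token that touches the end of the buffer might continue in the next read.
    if (i >= text.length) throw new IncompleteJSON("need more data");
    return JSON.parse(text.slice(from, i)); // SyntaxError for "tru", "01", ""
  }

  const result = value();
  return { value: result, end: i };
}
```

A caller would append each read to a buffer, call `parseFirst(buffer)`, slice off `end` characters on success, wait for more data on `IncompleteJSON`, and drop the connection on `SyntaxError`. The sketch is deliberately conservative: a malformed token at the very end of the buffer is reported as incomplete until more data arrives.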

Still, if anyone knows of a better library than mine (and just being used by lots of projects instead of just me would count as a major qualification), I'd love to know about it.

abarnert answered Oct 07 '22 23:10



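Here is a simple JSON object separator. It assumes that you receive a series of well-formed JSON objects (not arrays). It only finds where each object ends and passes the raw text to a callback; it does not parse or validate that text.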

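function JSONObjectSepaator() {

    // Callback invoked with the text of each complete object; override it.
    this.onObject = function (JSONStr) {};

    // Clear all scanner state, discarding any partially received object.
    this.reset = function () {
        this.brace_count = 0;   // nesting depth of unquoted braces
        this.inString = false;  // true while inside a string literal
        this.escaped = false;   // true right after a backslash in a string
        this.buffer = "";       // text of the object received so far
    };

    // Feed the next chunk of the stream. State carries over between calls,
    // so an object may be split across chunks at any position.
    this.receive = function (S) {
        var i;
        var pos = 0; // start of the not-yet-delivered part of S
        for (i = 0; i < S.length; i++) {
            var c = S[i];
            if (this.inString) {
                if (this.escaped) {
                    // The character after a backslash never ends the string.
                    this.escaped = false;
                } else {
                    if (c === "\\") {
                        this.escaped = true;
                    } else if (c === "\"") {
                        this.inString = false;
                    }
                }
            } else {
                if (c === "{") {
                    this.brace_count++;
                } else if (c === "}") {
                    this.brace_count--;
                    if (this.brace_count === 0) {
                        // The outermost brace closed: deliver one object.
                        this.buffer += S.substring(pos, i + 1);
                        this.onObject(this.buffer);
                        this.buffer = "";
                        pos = i + 1;
                    }
                } else if (c === "\"") {
                    this.inString = true;
                }
            }
        }
        // Keep the unfinished tail for the next call.
        this.buffer += S.substring(pos);
    };

    this.reset();
    return this;
}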

Use it like this:

var separator = new JSONObjectSepaator();
separator.onObject = function (o) {
    alert("Object received: "+o);
};

separator.receive('{"id":1,"result":{"answer":23},"error":null, "x');
separator.receive('x":"\\\""}{"id":2,"result":{"answer":42},"error":null}{"id":');
separator.receive('3,"result":{"answer":43},"err{or":3}');
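
The separator hands back strings, so each one still has to go through JSON.parse. As a rough sketch, the same brace-counting scan can be written as a stateless helper that returns parsed objects plus the unconsumed tail, which is closer to the raw_decode pattern in the question. `drainObjects` and `onData` are hypothetical names, and the helper rescans the leftover tail on every call, which is simpler but does more work than the stateful version.

```javascript
// Hypothetical helper: pull every complete top-level JSON object out of
// `buffer`. Returns the parsed objects and the text that is still pending.
// Assumes the stream contains only objects, like the separator it mirrors.
function drainObjects(buffer) {
  const objects = [];
  let depth = 0;        // nesting depth of unquoted braces
  let inString = false; // inside a string literal
  let escaped = false;  // previous character was a backslash in a string
  let start = 0;        // where the current object begins
  for (let i = 0; i < buffer.length; i++) {
    const c = buffer[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (c === "\\") escaped = true;
      else if (c === '"') inString = false;
    } else if (c === '"') {
      inString = true;
    } else if (c === "{") {
      depth++;
    } else if (c === "}") {
      depth--;
      if (depth === 0) {
        // JSON.parse validates the frame and throws SyntaxError if it is malformed.
        objects.push(JSON.parse(buffer.slice(start, i + 1)));
        start = i + 1;
      }
    }
  }
  return { objects, rest: buffer.slice(start) };
}

// Illustrative wiring: accumulate chunks and return whatever is complete.
let pending = "";
function onData(chunk) {
  const { objects, rest } = drainObjects(pending + chunk);
  pending = rest;
  return objects;
}
```

Calling `onData` with the three chunks from the example above yields nothing for the first chunk, the objects with ids 1 and 2 for the second, and the object with id 3 for the third.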
jbaylina answered Oct 07 '22 23:10
