Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Line-oriented streams in Node.js

I'm developing a multi-process application using Node.js. In this application, a parent process will spawn a child process and communicate with it using a JSON-based messaging protocol over a pipe. I've found that large JSON messages may get "cut off", such that a single "chunk" emitted to the data listener on the pipe does not contain the full JSON message. Furthermore, small JSON messages may be grouped in the same chunk. Each JSON message will be delimited by a newline character, and so I'm wondering if there is already a utility that will buffer the pipe read stream such that it emits one line at a time (and hence, for my application, one JSON document at a time). This seems like it would be a pretty common use case, so I'm wondering if it has already been done.

I'd appreciate any guidance anyone can offer. Thanks.

like image 923
jbeard4 Avatar asked Jul 04 '11 16:07

jbeard4


2 Answers

Maybe Pedro's carrier can help you?

Carrier helps you implement new-line terminated protocols over node.js.

The client can send you chunks of lines and carrier will only notify you on each completed line.

like image 83
Alfred Avatar answered Oct 23 '22 04:10

Alfred


My solution to this problem is to send JSON messages each terminated with some special unicode character. A character that you would never normally get in the JSON string. Call it TERM.

So the sender just does "JSON.stringify(message) + TERM;" and writes it. The reciever then splits incomming data on the TERM and parses the parts with JSON.parse() which is pretty quick. The trick is that the last message may not parse, so we simply save that fragment and add it to the beginning of the next message when it comes. Recieving code goes like this:

        s.on("data", function (data) {
        var info = data.toString().split(TERM);
        info[0] = fragment + info[0];
        fragment = '';

        for ( var index = 0; index < info.length; index++) {
            if (info[index]) {
                try {
                    var message = JSON.parse(info[index]);
                    self.emit('message', message);
                } catch (error) {
                    fragment = info[index];
                    continue;
                }
            }
        }
    });

Where "fragment" is defined somwhere where it will persist between data chunks.

But what is TERM? I have used the unicode replacement character '\uFFFD'. One could also use the technique used by twitter where messages are separated by '\r\n' and tweets use '\n' for new lines and never contain '\r\n'

I find this to be a lot simpler than messing with including lengths and such like.

like image 32
zicog Avatar answered Oct 23 '22 03:10

zicog