How to transfer/stream big data from/to child processes in node.js without using the blocking stdio?

I have a bunch of (child)processes in node.js that need to transfer large amounts of data.

When I read the manual it says that the stdio and IPC interface between them are blocking, so that won't do.

I'm looking into using file descriptors, but I cannot find a way to stream from them (see my other, more specific question: How to stream to/from a file descriptor in node?).

I think I might use a net socket, but I fear that has unwanted overhead.

I also saw this question, but it is not the same (and has no answers): How to send huge amounts of data from child process to parent process in a non-blocking way in Node.js?

Asked Jul 05 '14 by Bartvds

People also ask

How does Node.js handle child processes?

Node.js normally runs single-threaded with non-blocking I/O, but a single thread cannot handle an ever-increasing workload, so the child_process module can be used to spawn child processes. Parent and child processes communicate with each other through a built-in messaging system.
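For illustration, a minimal sketch of that messaging system using child_process.fork(), which wires up the IPC channel automatically (worker.js here is a hypothetical script):

var child_process = require('child_process');
var worker = child_process.fork('./worker.js');

// receive messages from the child
worker.on('message', function (msg) {
    console.log('from child:', msg);
});

// send a message to the child
worker.send({cmd: 'start'});

// in worker.js, the counterpart looks like:
// process.on('message', function (msg) {
//     process.send({ok: true});
// });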

What is one advantage of the spawn method over the other child_process methods?

By default, the spawn function does not create a shell to execute the command we pass into it. This makes it slightly more efficient than the exec function, which does create a shell.
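A sketch of the difference (the ls command and shell pipeline are just example commands):

var child_process = require('child_process');

// spawn: no shell, arguments passed directly, output arrives as a stream
var ls = child_process.spawn('ls', ['-l', '/usr']);
ls.stdout.pipe(process.stdout);

// exec: spawns a shell to parse the command string and buffers all output
child_process.exec('ls -l /usr | wc -l', function (err, stdout) {
    if (err) throw err;
    console.log(stdout);
});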

Which methods can be used to switch a readable stream between modes?

One way to switch a stream into flowing mode is to attach a 'data' event listener. Another way to switch a readable stream into flowing mode manually is to call the stream.resume() method.
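A minimal sketch of both approaches, assuming some-file.txt is a hypothetical file:

var fs = require('fs');
var readable = fs.createReadStream('./some-file.txt');

// attaching a 'data' listener switches the stream into flowing mode
readable.on('data', function (chunk) {
    console.log('got %d bytes', chunk.length);
});

// alternatively, resume() switches to flowing mode explicitly
// (data is discarded if no 'data' listener is attached)
// readable.resume();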


1 Answer

I found a solution that seems to work: when spawning the child process you can pass options for stdio and set up a pipe to stream data.

The trick is to add an additional stdio element and set it to 'pipe'.

In the parent process, stream to and from child.stdio[3].

var child_process = require('child_process');

// fds 0-2 are the usual stdio; the extra 'pipe' entry becomes fd 3
var opts = {
    stdio: [process.stdin, process.stdout, process.stderr, 'pipe']
};
var child = child_process.spawn('node', ['./child.js'], opts);

// send data
mySource.pipe(child.stdio[3]);

// read data
child.stdio[3].pipe(myHandler);

In the child, open a stream for file descriptor 3.

var fs = require('fs');

// read from it
var readable = fs.createReadStream(null, {fd: 3});

// write to it
var writable = fs.createWriteStream(null, {fd: 3});

Note that not every stream you get from npm works correctly over this pipe: I tried JSONStream.stringify() and it produced errors, but it worked after I piped it through through2 first (no idea why that is).
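For illustration, a sketch of that workaround; it assumes the JSONStream and through2 modules are installed, and records is a placeholder for an object-mode source stream:

var JSONStream = require('JSONStream');
var through2 = require('through2');

records
    .pipe(JSONStream.stringify())
    .pipe(through2())          // plain pass-through transform
    .pipe(child.stdio[3]);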

Edit: some observations: the pipe is not always a Duplex stream, so you might need two pipes. And there is something weird going on where, in one case, it only works if I also have an IPC channel, so 6 entries total: [stdin, stdout, stderr, pipe, pipe, ipc].
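For illustration, a sketch of that six-entry layout; I'm assuming fd 3 carries parent-to-child data and fd 4 carries child-to-parent data, which is one possible arrangement:

var opts = {
    stdio: ['inherit', 'inherit', 'inherit', 'pipe', 'pipe', 'ipc']
};
var child = child_process.spawn('node', ['./child.js'], opts);

mySource.pipe(child.stdio[3]);    // parent writes to the child on fd 3
child.stdio[4].pipe(myHandler);   // parent reads from the child on fd 4

// in the child, fd 3 is read with fs.createReadStream(null, {fd: 3})
// and fd 4 is written with fs.createWriteStream(null, {fd: 4})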

Answered Nov 07 '22 by Bartvds