
Using cluster in a Node module

UPDATE: Even if this particular scenario is not realistic, as per comments, I'm still interested in how one could write a module that makes use of clustering without rerunning the parent process each time.


I'm trying to write a Node.js module called mass-request that speeds up large numbers of HTTP requests by distributing them to child processes.

My hope is that, from the outside, it would work like this:

var mr = require("mass-request"),
    scraper = mr();

for (var i = 0; i < my_urls_to_visit.length; i += 1) {
    scraper.add(my_urls_to_visit[i], function(resp) {
        // do something with response
    });
}

To get started, I put together a skeleton for the mass-request module.

var cluster = require("cluster"),
    numCPUs = require("os").cpus().length;

module.exports = function() {
    console.log("hello from mass-request!");
    if (cluster.isMaster) {
        for (var i = 0; i < numCPUs; i += 1) {
            var worker = cluster.fork();             
        }

        return {
            add: function(url, cb) {}       
        }       
    } else {
        console.log("worker " + process.pid + " is born!");
    }  
}

Then I test it like so in a test script:

var mr = require("mass-request");
var m = mr();
console.log("hello from test.js!", m);

I expected to see "hello from mass-request!" logged four times (as indeed it is). To my amazement, I also see "hello from test.js!" four times. Clearly I do not understand how cluster.fork() works. Is it rerunning the whole process, not just the function that calls it the first time?

If so, how does one make use of clustering in a module without troubling the person who uses that module with messy multi-process logic?

Chris Wilson asked May 20 '14 23:05





1 Answer

I believe what you are looking for is cluster.setupMaster.

From the docs:

cluster.setupMaster([settings])

  • settings Object
    • exec String file path to worker file. (Default=process.argv[1])
    • args Array string arguments passed to worker. (Default=process.argv.slice(2))
    • silent Boolean whether or not to send output to parent's stdio. (Default=false)

setupMaster is used to change the default 'fork' behavior. Once called, the settings will be present in cluster.settings.

By making use of the exec property you can have your workers launched from a different module.

Important: as the docs state, this can only be called once. If you are depending on this behavior for your module, then the caller can't be using cluster or the whole thing falls apart.

For example:

index.js

var cluster = require("cluster"),
  path = require("path"),
  numCPUs = require("os").cpus().length;

console.log("hello from mass-request!");
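if (cluster.isMaster) {
  // Launch workers from worker.js instead of re-running the caller's entry script.
  cluster.setupMaster({
    exec: path.join(__dirname, 'worker.js')
  });

  // Start one worker per CPU core.
  for (var i = 0; i < numCPUs; i += 1) {
    cluster.fork();
  }

  // Export the public API so that require() hands it to the caller.
  module.exports = {
    add: function (url, cb) {
      // Dispatching the URL to a worker is left unimplemented here.
    }
  };
} else {
  // Workers forked above run worker.js, so this branch only runs if
  // index.js is itself loaded inside a cluster worker.
  console.log("worker " + process.pid + " is born!");
}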

worker.js

console.log("worker " + process.pid + " is born!");

output

node index.js 
hello from mass-request!
worker 38821 is born!
worker 38820 is born!
worker 38822 is born!
worker 38819 is born!
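
The add function in the example above is left empty. As a rough sketch of how it could hand URLs to the workers, the snippet below distributes jobs round-robin and matches each reply to its callback by id. The names createDispatcher and createJobHandler, the { id, url } / { id, resp } message shape, and the fetchUrl function are all assumptions made for illustration; none of them come from the question or from the cluster API. The sketch relies only on each worker exposing send() and a "message" event, which the objects returned by cluster.fork() do.

```javascript
// Master side (hypothetical helper): wraps an array of worker-like objects,
// i.e. anything with send(msg) and on("message", fn), such as the values
// returned by cluster.fork().
function createDispatcher(workers) {
  var nextId = 0,      // id used to match a reply to its callback
      nextWorker = 0,  // round-robin cursor into the workers array
      pending = {};    // job id -> callback waiting for that job

  workers.forEach(function (worker) {
    worker.on("message", function (msg) {
      var cb = pending[msg.id];
      if (!cb) { return; }       // ignore replies we did not ask for
      delete pending[msg.id];
      cb(msg.resp);
    });
  });

  return {
    add: function (url, cb) {
      var id = nextId++;
      pending[id] = cb;
      workers[nextWorker].send({ id: id, url: url });
      nextWorker = (nextWorker + 1) % workers.length;
    }
  };
}

// Worker side (hypothetical helper, would live in worker.js): turns one job
// message into one reply. fetchUrl(url, done) is an assumed function that
// performs the request and calls done(resp); send delivers the reply.
function createJobHandler(fetchUrl, send) {
  return function (msg) {
    fetchUrl(msg.url, function (resp) {
      send({ id: msg.id, resp: resp });
    });
  };
}
```

In index.js, the values returned by cluster.fork() could be pushed into an array and passed to createDispatcher, with the result assigned to module.exports. In worker.js, something like process.on("message", createJobHandler(fetchUrl, process.send.bind(process))) would wire up the other end. Responses cross the process boundary as serialized messages, so resp should be plain data rather than a live response object.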
dc5 answered Sep 28 '22 05:09
