Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is Node cluster.fork() forking the parent scope when implemented as a module

I'm attempting to implement a Node module which uses cluster. The problem is that the entire parent scope is forked alongside the intended cluster code. I discovered it while writing tests in Mocha for the module: the test suite will run many times, instead of once.

See below, myModule.js creates N workers, one for each CPU. These workers are http servers, or could be anything else.

Each time the test.js runs, the script runs N + 1 times. In the example below, console.log runs 5 times on my quad core.

Can someone explain if this is an implementation issue or cluster config issue? Is there any way to limit the scope of fork() without having to import a module ( as in this solution https://github.com/mochajs/mocha/issues/826 )?

/// myModule.js ////////////////////////////////////

var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

var startCluster = function(){

  if (cluster.isMaster) {
    // CREATE A CLUSTER OF FORKED WORKERS, ONE PER CPU
    //master does not listen to UDP messages.

    for (var i = 0; i < numCPUs; i++) {

      var worker = cluster.fork();
    }


  } else {
    // Worker processes have an http server.
    http.Server(function (req, res){
      res.writeHead(200);
      res.end('hello world\n');
    }).listen(8000);

  }

  return
}

module.exports = startCluster;

/////////////////////////////////////////////////

//// test.js ////////////////////////////////////

var startCluster = require('./myModule.js')

startCluster()

console.log('hello');

////////////////////////////////////////////////////////
like image 597
LPG Avatar asked Sep 21 '16 21:09

LPG


2 Answers

So I'll venture an answer. Looking closer at the node docs there is a cluster.setupMaster which can override defaults. The default on a cluster.fork() is to execute the current script, with "file path to worker file. (Default=process.argv[1])" https://nodejs.org/docs/latest/api/cluster.html#cluster_cluster_settings

So if another module is importing a script with a cluster.fork() call, it will still use the path of process.argv[1], which may not be the path you expect, and have unintended consequences.

So we shouldn't initialize the cluster master and worker in the same file as the official docs suggest. It would be prudent to separate the worker into a new file and override the default settings. (Also for safety you can add the directory path with __dirname ).

cluster.setupMaster({ exec: __dirname + '/worker.js',});

So here would be the corrected implementation:

/// myModule.js ////////////////////////////////////

var cluster = require('cluster');
var numCPUs = require('os').cpus().length;

var startCluster = function(){
  cluster.setupMaster({ 
    exec: __dirname + '/worker.js'
  });

  if (cluster.isMaster) {
    // CREATE A CLUSTER OF FORKED WORKERS, ONE PER CPU    
    for (var i = 0; i < numCPUs; i++) {
       var worker = cluster.fork();
    }

  } 
  return
}

module.exports = startCluster;

/////////////////////////////////////////////////

//// worker.js ////////////////////////////////////
var http = require('http');

// All worker processes have an http server.
http.Server(function (req, res){
res.writeHead(200);
res.end('hello world\n');
}).listen(8000);

////////////////////////////////////////////////////////

//// test.js ////////////////////////////////////

var startCluster = require('./myModule.js')

startCluster()

console.log('hello');

////////////////////////////////////////////////////////

You should only see 'hello' once instead of 1 * Number of CPUs

like image 86
LPG Avatar answered Oct 05 '22 22:10

LPG


You need to have the "isMaster" stuff at the top of your code, not inside the function. The worker will run from the top of the module ( it's not like a C++ fork, where the worker starts at the fork() point ).

like image 27
Nick Perkins Avatar answered Oct 05 '22 22:10

Nick Perkins