
How to limit (or queue) calls to external processes in Node.JS?

Scenario

I have a Node.JS service (written using ExpressJS) that accepts image uploads via DnD (example). After an image is uploaded, I do a few things to it:

  1. Pull EXIF data from it
  2. Resize it

These calls are being handled via the node-imagemagick module at the moment and my code looks something like this:

app.post('/upload', function(req, res){
  // ... <stuff here> ...

  im.readMetadata('./upload/image.jpg', function(err, meta) {
      // handle EXIF data.
  });

  im.resize(..., function(err, stdout, stderr) {
      // handle resize.
  });
});

Question

As some of you have already spotted, the problem is that with enough simultaneous uploads, every single upload spawns an 'identify' call and then a resize operation (both ImageMagick processes), effectively killing the server under high load.

Just testing with ab -c 100 -n 100 locks up my little 512 Linode dev server to the point that I have to force a reboot. I understand that my test may simply be too much load for the server, but I would like a more robust approach to processing these requests so that I get a more graceful failure than total VM suicide.

In Java I solved this issue by creating a fixed-thread ExecutorService that queues up the work and executes it on at most X number of threads.

In Node.JS, I am not even sure where to start to solve a problem like this. I don't quite have my brain wrapped around the non-threaded nature and how I can create an async JavaScript function that queues up the work while another... (thread?) processes the queue.

Any pointers on how to think about this or how to approach this would be appreciated.

Addendum

This is not the same as this question about FFMpeg, although I imagine that person will have this exact same question as soon as his webapp is under load as it boils down to the same problem (firing off too many simultaneous native processes in parallel).

Riyad Kalla asked Sep 02 '11 23:09




2 Answers

The threads module should be just what you need:

https://github.com/robtweed/threads

Rob answered Oct 06 '22 01:10


Since Node does not allow threading, you can do the work in another process. One option is a background job system, like Resque, where you queue jobs into a datastore of some type and then run a process (or several processes) that pulls jobs from the datastore and does the processing. Another is something like node-worker, where you queue your jobs in the worker's memory. Either way, your main application is freed from doing the processing itself and can focus on serving web requests.
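
One way to picture the queueing idea without any extra library is an in-process limiter that lets at most N child processes run at once, much like the fixed-size ExecutorService mentioned in the question. The sketch below is only an illustration under assumptions: createLimiter and runCommand are hypothetical helper names, the limit of 2 is arbitrary, and the Node binary itself (process.execPath) stands in for the real ImageMagick commands. It relies only on Node's built-in child_process.execFile.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: returns a function that accepts tasks (functions
// returning a promise) and keeps at most `limit` of them running at once.
function createLimiter(limit) {
  let active = 0;
  const waiting = [];

  function next() {
    if (active >= limit || waiting.length === 0) return;
    active++;
    const { task, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active--;
        next(); // a slot is free again, start the next waiting task
      });
  }

  return function run(task) {
    return new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject });
      next();
    });
  };
}

// Hypothetical helper: promise wrapper around the built-in execFile.
function runCommand(file, args) {
  return new Promise((resolve, reject) => {
    execFile(file, args, (err, stdout) => (err ? reject(err) : resolve(stdout)));
  });
}

// Assumption: the Node binary stands in for an external image tool.
const limited = createLimiter(2);
for (let i = 1; i <= 5; i++) {
  limited(() => runCommand(process.execPath, ['-e', `console.log(${i} * 10)`]))
    .then((stdout) => console.log(`job ${i} printed ${stdout.trim()}`))
    .catch((err) => console.error(`job ${i} failed: ${err.message}`));
}
```

All five jobs are accepted immediately, but no more than two child processes exist at any moment; the rest wait in memory. A real service would probably also want to cap the length of the waiting list and reject uploads beyond it, so that the queue itself cannot grow without bound.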

[Update] Another interesting library to check out is hook.io, especially if you like the idea of node-workers but want to run multiple background processes. [/Update]

[Edit]

Here's a quick and dirty example of pushing work that takes a while to run to a worker process using node-worker; the worker queues jobs and processes them one by one:

app.js

var Worker = require('worker').Worker;
var processor = new Worker('image_processor.js');

// Hand jobs 0 through 100 to the worker; it queues them on its side.
for(var i = 0; i <= 100; i++) {
  console.log("adding a new job");
  processor.postMessage({job: i});
}

// Called each time the worker reports a finished job.
processor.onmessage = function(msg) {
  console.log("worker done with job " + msg.job);
  console.log("result is " + msg.data.result);
};

image_processor.js

var worker = require('worker').worker;
var queue = [];

// Incoming jobs are only queued here; process_job picks them up later.
worker.onmessage = function(msg) {
  var job = msg.job;
  queue.push(job);
};

var process_job = function() {
  // Nothing queued: check again in 100 ms.
  if(queue.length === 0) {
    setTimeout(process_job, 100);
    return;
  }

  var job = queue.shift();
  var data = {};

  data.result = job * 10;

  // Simulate one second of work, report the result, then take the next job.
  setTimeout(function() {
    worker.postMessage({job: job, data: data});
    process_job();
  }, 1000);
};

process_job();
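
The worker above polls every 100 ms while its queue is empty. As a rough sketch of an alternative, the same one-job-at-a-time queue can be woken whenever a job is pushed instead. Everything here is illustrative: createJobQueue and doWork are hypothetical names that are not part of node-worker, and the short timer stands in for the real image processing.

```javascript
// Hypothetical stand-in for the slow per-job work (EXIF read, resize, ...).
function doWork(job, callback) {
  setTimeout(function() {
    callback(null, {result: job * 10});
  }, 10);
}

// Illustrative helper: runs queued jobs strictly one at a time, no polling.
function createJobQueue(work, onDone) {
  var queue = [];
  var busy = false;

  function processNext() {
    // Either a job is already running or there is nothing to do.
    if (busy || queue.length === 0) return;

    busy = true;
    var job = queue.shift();

    work(job, function(err, data) {
      busy = false;
      onDone(err, job, data);
      processNext(); // pick up the next queued job, if any
    });
  }

  return {
    push: function(job) {
      queue.push(job);
      processNext(); // wake the queue instead of waiting for a timer
    }
  };
}

var jobs = createJobQueue(doWork, function(err, job, data) {
  if (err) {
    console.error("job " + job + " failed: " + err.message);
    return;
  }
  console.log("job " + job + " result is " + data.result);
});

jobs.push(1);
jobs.push(2);
jobs.push(3);
```

Inside a worker, push would be called from the onmessage handler and onDone would post the result back to the main process.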
Michelle Tilley answered Oct 06 '22 00:10