
Concurrency limit in Q promises - node

Is there any method to limit concurrency of promises using Q promises library?

This question is kinda related to How can I limit Q promise concurrency?

but the problem is that I'm trying to do something like this:

for (var i = 0; i <= 1000; i++) {
  return Q.all([ task1(i), task2(i) ]); // <-- limit this to 2 at a time.
}

The real use case is:

  1. Fetch posts from DB
  2. Loop over every post, e.g. posts.forEach(function(post) { ... })
  3. For every post do task1, task2, task3 (retrieve social counters, retrieve comments count, etc)
  4. Save new post data in DB.

But the problem is that node is executing all tasks for all posts at the same time, like asking facebook for the "likes count" for 500 posts at the same time.

How can I limit Q.all() so that only 2 posts at a time are executing their tasks? Or what other solutions could apply here?
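One simple way to get this effect (sketched here with plain ES6 Promises rather than Q, though `Q.all` works the same way; `processInBatches` is a made-up helper name, not a library API) is to process the array in fixed-size batches, waiting for each batch before starting the next:

```javascript
// Process `items` in batches of `limit`: run `worker` on `limit` items
// at once, wait for all of them to settle, then move to the next batch.
function processInBatches(items, limit, worker) {
  var results = [];
  function nextBatch(start) {
    if (start >= items.length) return Promise.resolve(results);
    // Start the next `limit` workers together.
    var batch = items.slice(start, start + limit).map(worker);
    return Promise.all(batch).then(function (batchResults) {
      results = results.concat(batchResults);
      return nextBatch(start + limit); // recurse into the next batch
    });
  }
  return nextBatch(0);
}
```

For the use case above, the worker would be something like `function(post) { return Q.all([task1(post), task2(post)]); }` and `limit` would be 2. Note the trade-off: a batch only advances when its *slowest* task finishes, so a pool (next answer's approach) keeps the pipeline fuller.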

Note: most of the tasks (if not all) rely on the request library.

Asked by Félix Sanz on Mar 12 '14

People also ask

Is there any limit for Promise.all?

Assuming we have the processing power and that our promises can run in parallel, there is a hard limit of just over 2 million promises.

Are promises concurrent or parallel?

On a single-core CPU the promises run concurrently; on a multi-core CPU they can be executed in parallel for CPU-intensive tasks.

Does promise all use multiple threads?

Often Promise.all() is thought of as running in parallel, but this isn't the case. Parallel means doing many things at the same time on multiple threads. JavaScript, however, is single-threaded, with one call stack and one memory heap.

What is Promise pool?

The promise pool ensures a maximum number of concurrently processed tasks. Each task in the promise pool is individual from others, meaning that the pool starts processing the next task as soon as one finishes. This handling ensures the best batch-processing for your tasks.
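The promise-pool idea described above can be sketched without any library (this is a hand-rolled illustration; `promisePool` is a hypothetical name, not a real API): a fixed number of workers each pull the next task as soon as their current one settles.

```javascript
// Run `tasks` (an array of functions that each return a promise) with
// at most `poolSize` of them in flight at once. As soon as one task
// settles, that worker slot immediately starts the next pending task.
function promisePool(tasks, poolSize) {
  var i = 0;
  var results = new Array(tasks.length);
  function workerLoop() {
    if (i >= tasks.length) return Promise.resolve();
    var index = i++; // claim the next task
    return tasks[index]().then(function (value) {
      results[index] = value;
      return workerLoop(); // this slot picks up the next task
    });
  }
  var workers = [];
  for (var w = 0; w < Math.min(poolSize, tasks.length); w++) {
    workers.push(workerLoop());
  }
  return Promise.all(workers).then(function () { return results; });
}
```

Unlike batch processing, a pool never leaves a slot idle while other tasks are still pending, which matches the "starts the next task as soon as one finishes" behavior described above.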


2 Answers

Thanks to Dan's answer and his help integrating it with my code, this can be done using his gist and a snippet like this:

var Q = require('q');
var request = require('request');
var qlimit = require('../libs/qlimit');

var test = function(id) {
  console.log('Running ' + id);
  return Q.nfcall(request, 'some dummy url which takes some time to process, for example a php file with sleep(5)').spread(function(response, body) {
    console.log('Response ' + id);
    return body;
  });
};

test = qlimit.limitConcurrency(test, 1);

var data = [0, 1, 2];

data.forEach(function(id) {
  console.log('Starting item ' + id);
  Q.all([ test(id) ]);
});

This way you get something like:

  • Starting item 0
  • Starting item 1
  • Starting item 2
  • Running 0
  • Response 0
  • Running 1
  • Response 1
  • Running 2
  • Response 2

Which clearly is 1 request at a time.

The whole point that I was missing in the implementation is that you need to wrap the function with limitConcurrency once, BEFORE starting the loop, not inside it.
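Dan's gist isn't reproduced here, but the placement issue is easy to show with a simplified stand-in (this is an illustrative implementation, not the gist's actual code): the wrapper keeps one shared running-count, so it only limits anything if every call goes through the same wrapped function.

```javascript
// Simplified stand-in for a limitConcurrency(fn, limit) wrapper:
// calls beyond `limit` are queued and started as earlier ones settle.
// The key detail: `running` is shared by ALL calls to the returned
// function, which is why the wrapper must be created once, up front.
function limitConcurrency(fn, limit) {
  var running = 0;
  var queue = [];
  function runNext() {
    if (running >= limit || queue.length === 0) return;
    running++;
    var job = queue.shift();
    fn.apply(null, job.args).then(
      function (value) { running--; job.resolve(value); runNext(); },
      function (err)   { running--; job.reject(err);   runNext(); }
    );
  }
  return function () {
    var args = arguments;
    return new Promise(function (resolve, reject) {
      queue.push({ args: args, resolve: resolve, reject: reject });
      runNext();
    });
  };
}

// Wrong: wrapping inside the loop makes a fresh counter per item,
// so nothing is actually limited:
//   data.forEach(function (id) { limitConcurrency(test, 1)(id); });
// Right: wrap once, before the loop, so all calls share one counter:
//   test = limitConcurrency(test, 1);
//   data.forEach(function (id) { test(id); });
```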

Answered by Félix Sanz on Sep 28 '22

I asked a very similar question a few days ago: Node.js/Express and parallel queues

The solution I've found (see my own answer) was to use Caolan's async. It allows you to create "operation queues", and you can limit how many can run concurrently: see the "queue" method.

In your case, Node's main loop would pull the posts and create a task in the queue for each of them. You could also limit how many tasks sit in the queue at once (so you don't basically rebuild the queue outside of async), for example by adding N new elements only when the last one starts being executed (the "empty" callback of the "queue" method).
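async.queue itself isn't reproduced here; the sketch below is a dependency-free approximation of just the parts this answer relies on (a worker function plus a concurrency limit, push(), and a drain callback fired when the last task finishes). `makeQueue` is a made-up name, and the real library offers far more:

```javascript
// Minimal queue with a concurrency limit, mimicking the shape of
// async.queue(worker, concurrency): push tasks, at most `concurrency`
// run at once, and `drain` fires when everything has finished.
function makeQueue(worker, concurrency) {
  var tasks = [];
  var running = 0;
  var q = {
    drain: null,
    push: function (task, done) {
      tasks.push({ task: task, done: done });
      runNext();
    }
  };
  function runNext() {
    if (running >= concurrency || tasks.length === 0) return;
    running++;
    var item = tasks.shift();
    worker(item.task, function (err) {
      running--;
      if (item.done) item.done(err);
      if (tasks.length === 0 && running === 0 && q.drain) q.drain();
      else runNext();
    });
  }
  return q;
}

// Usage mirroring the answer: each post becomes a queued task and only
// two posts are processed at a time (the setTimeout stands in for
// task1/task2/task3).
var q = makeQueue(function (post, callback) {
  setTimeout(function () { callback(null); }, 10); // placeholder work
}, 2);
q.drain = function () { console.log('all posts processed'); };
[1, 2, 3, 4].forEach(function (post) { q.push(post); });
```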

Answered by ItalyPaleAle on Sep 28 '22