Q - executing a series of promises and defining dependencies between them in a DAG

I would like to process a series of data items, where the output of each may be used as input to others.

For example:

    var batch = [
        {"id":"a1","depends":[],"data":{"some":"data a1"}},
        {"id":"b1","depends":["a1"],"data":{"some":"data b1"}},
        {"id":"b2","depends":["a1"],"data":{"some":"data b2"}},
        {"id":"c1","depends":["b1","b2"],"data":{"some":"data c1"}},
        {"id":"x1","depends":[],"data":{"some":"data x1"}},
    ];

This means that once a1 completes, its output is sent to both b1 and b2; when both of these complete, their outputs are sent to c1 (and only once both have completed). x1 may execute in parallel with all of a1, b1, b2, and c1; and b1 may execute in parallel with b2, as no dependency is defined between them.

Upon completion of c1 and x1, and therefore the completion of all 5 of them, the output of all five should be returned.

We will assume that no circular dependencies are defined, so the batch forms a directed acyclic graph (DAG).

I would like to know how to implement this using Q, because:

  • All the processing of the data will be asynchronous, and thus I will need to use either callbacks, or deferreds and promises; and I prefer the latter
  • Promises can double up as a convenient way to define the edges in the graph

However, I have not been able to take this past the conceptual stage:

    var Q = require('q');

    // Maps each line id to the promise for that line's result
    var doPromises = {};

    var doData = function(data, dependsResultsHash, callback) {
      // Not real processing, simply echoes input after a delay for async simulation purposes
      var out = {
        echo: {
          data: data,
          dependsResultsHash: dependsResultsHash
        }
      };
      setTimeout(function() {
        callback(out);
      }, 1000);
    };

    var doLine = function(id, depIds, data) {
      var deferred = Q.defer();
      // Assumes every dependency appears earlier in the batch than its dependents,
      // so that its promise is already registered in doPromises
      var dependsPromises = [];
      for (var i = 0; i < depIds.length; ++i) {
        var depId = depIds[i];
        var dependPromise = doPromises[depId];
        dependsPromises.push(dependPromise);
      }
      Q.all(dependsPromises).then(function(dependsResults) {
        // Re-key the dependency results by id
        var dependsResultsHash = {};
        for (var i = 0; i < depIds.length; ++i) {
          var depId = depIds[i];
          var depResult = dependsResults[i];
          dependsResultsHash[depId] = depResult;
        }
        doData(data, dependsResultsHash, function(result) {
          deferred.resolve(result);
        });
      });
      return deferred.promise;
    };

    var doBatch = function(batch) {
      var linePromises = [];
      for (var i = 0; i < batch.length; ++i) {
        var line = batch[i];
        var linePromise = doLine(line.id, line.depends, line.data);
        linePromises.push(linePromise);
        doPromises[line.id] = linePromise;
      }
      // Resolves with the results of all lines, in batch order
      return Q.all(linePromises).then(function(lineResults) {
        console.log(lineResults);
        return lineResults;
      });
    };

    doBatch(batch);

(Note that this code is untested and I do not expect it to work, just to illustrate the points necessary for my question.)

I would like to know:

  • Am I doing this right? Am I completely missing the point with the Q library, or with deferreds and promises?
  • My main concern is with the doLine function:

    -- Is the way that I have selected the promises of the lines depended upon from the global list of promises `doPromises` ok?
    -- Is the way that I have obtained the results of the lines depended upon, and interpreted them, OK?
    
  • With the doBatch function:

    -- I have a local array for `linePromises` and an external hash for `doPromises`, and I feel that these should be combined. How can I do this correctly?
    
  • General

    -- The code above presently assumes that all `deferred`s will eventually keep their `promise`s. What if they fail or throw an exception; how do I make my code more robust in handling this?
    -- I have used a closure to allow access to `doPromises` in both `doBatch` and `doLine`, and it seems a little odd here. Is there a better way to do this?
    
bguiz asked Oct 21 '22 06:10


2 Answers

I have created a library that does this:

qryq is a NodeJs library that allows one to express a series of queries and define dependencies between them either in parallel, in sequence, or in a directed acyclic graph.
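For readers who want to see the underlying idea without a library, here is a minimal sketch. It is not qryq's API: `runBatch` and `processLine` are hypothetical names, and native `Promise` stands in for Q (`Promise.all` plays the role of `Q.all` here). One deferred is created per id before any wiring happens, so the order of lines in the batch no longer matters, and a failure in one line rejects every line that depends on it.

```javascript
// Hypothetical worker: echoes its inputs after a short delay to simulate async work.
function processLine(data, dependsResults) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve({ echo: { data: data, dependsResults: dependsResults } });
    }, 10);
  });
}

// Hypothetical runner: batch is an array of { id, depends, data } as in the question.
function runBatch(batch, worker) {
  // One deferred per id, created up front so batch order does not matter.
  var deferreds = {};
  batch.forEach(function (line) {
    var d = {};
    d.promise = new Promise(function (resolve, reject) {
      d.resolve = resolve;
      d.reject = reject;
    });
    deferreds[line.id] = d;
  });

  batch.forEach(function (line) {
    var deps = line.depends.map(function (depId) {
      if (!deferreds[depId]) {
        return Promise.reject(new Error('Unknown dependency: ' + depId));
      }
      return deferreds[depId].promise;
    });
    Promise.all(deps)
      .then(function (results) {
        // Re-key the dependency results by id before handing them to the worker.
        var byId = {};
        line.depends.forEach(function (depId, i) { byId[depId] = results[i]; });
        return worker(line.data, byId);
      })
      // Forward both success and failure, so a failed dependency fails its dependents.
      .then(deferreds[line.id].resolve, deferreds[line.id].reject);
  });

  // Resolves with an object keyed by id, or rejects with the first failure.
  return Promise.all(batch.map(function (line) {
    return deferreds[line.id].promise;
  })).then(function (results) {
    var out = {};
    batch.forEach(function (line, i) { out[line.id] = results[i]; });
    return out;
  });
}
```

With the `batch` from the question, `runBatch(batch, processLine).then(console.log, console.error)` would print one result per id, or the first error. The sketch still assumes the graph is acyclic; a cycle would leave the affected promises pending forever.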

bguiz answered Oct 23 '22 21:10


I've recently made a module called dagmise that I am planning to use for a build system that treats promises as tasks. I ended up making the nodes of the graph functions that return promises. When a node is visited, its function is evaluated and the returned promise takes its place as the node's value. So even if a node is visited multiple times, the function is only executed once.

I started with the idea that the promises should be the edges, but now I think it is simpler to have them at the nodes. Otherwise you really end up having two kinds of objects in your graphs (nodes/states and edges/promises), which complicates things a bit.
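The node-centred idea can be sketched roughly as follows. This is an illustration only, not dagmise's actual API: `makeGraph`, `visit`, and the `{ deps, task }` node shape are hypothetical, and native `Promise` is assumed. Each node holds a function; the first visit runs it and caches the resulting promise, and later visits reuse that promise.

```javascript
// Hypothetical sketch; nodes is { name: { deps: [names], task: function (depResults) } }.
function makeGraph(nodes) {
  var cache = {}; // name -> promise, filled in on the first visit

  function visit(name) {
    if (!cache[name]) {
      var node = nodes[name];
      if (!node) {
        return Promise.reject(new Error('Unknown node: ' + name));
      }
      // Visit dependencies first, then run this node's task with their results.
      // Assumes the graph is acyclic; a cycle would recurse without end.
      cache[name] = Promise.all(node.deps.map(visit)).then(function (results) {
        return node.task(results);
      });
    }
    return cache[name];
  }

  return { visit: visit };
}
```

Visiting a node that two others depend on, such as a1 in the question's batch, runs its task once, because the second visit finds the cached promise. Since the promise lives at the node, the edges are just the `deps` names and need no object of their own.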

spelufo answered Oct 23 '22 21:10