Algorithm run from within Node HTTP request takes much longer to run

I have a Node app that plots data on an x, y dot plot. Currently, I make a GET request from the front end; my back-end Node server accepts the request, loops through an array of data points, draws a canvas using Node Canvas, and streams it back to the front end, where it's displayed as a PNG image.

Complicating things, there can be polygons, so my algorithm checks whether each point is inside a polygon (using the point-in-polygon package) and colors that data point differently if it is.
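For readers unfamiliar with the check described above, here is a minimal ray-casting sketch. It is an illustration only, not the point-in-polygon package's code and not the code from the plunker; the names `isInsidePolygon` and `countContainingPolygons` are hypothetical, and polygons are assumed to be arrays of `[x, y]` vertices.

```javascript
// Hypothetical helper: ray-casting test for a point against one polygon.
// `point` is [x, y]; `polygon` is an array of [x, y] vertices (assumed shape).
function isInsidePolygon(point, polygon) {
  var x = point[0];
  var y = point[1];
  var inside = false;

  // Walk every edge (j -> i) and toggle `inside` each time a horizontal
  // ray cast to the right of the point crosses that edge.
  for (var i = 0, j = polygon.length - 1; i < polygon.length; j = i++) {
    var xi = polygon[i][0], yi = polygon[i][1];
    var xj = polygon[j][0], yj = polygon[j][1];

    var crosses = (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;

    if (crosses) {
      inside = !inside;
    }
  }
  return inside;
}

// Hypothetical helper: how many polygons contain the point. A color could
// then be chosen from this count.
function countContainingPolygons(point, polygons) {
  var count = 0;
  for (var k = 0; k < polygons.length; k++) {
    if (isInsidePolygon(point, polygons[k])) {
      count++;
    }
  }
  return count;
}

var square = [[0, 0], [10, 0], [10, 10], [0, 10]];
console.log(isInsidePolygon([5, 5], square));  // true
console.log(isInsidePolygon([15, 5], square)); // false
```

The check costs one pass over the polygon's edges per point, so with 800,000 points any overhead in reading a vertex is multiplied many times over.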

This works fine when there are fewer than 50,000 data points. However, when there are 800,000, the request takes approximately 23 seconds. I have profiled the code, and most of that time is spent looping through all the data points, working out where to plot each one on the canvas and what color it should be (depending on whether it's in one or more polygons). Here's a plunker I made. Basically, I do something like this:

var plotColor = [];

for (var i = 0; i < data.length; i++) {

  // get raw points
  var x = data[i][0];
  var y = data[i][1];

  // convert to a point on canvas
  var pointX = getPointOnCanvas(x);
  var pointY = getPointOnCanvas(y, 'y');

  // pick the color based on which polygons contain the point
  var color = getColorOfCell(pointX, pointY);

  plotColor.push({
      color: color,
      pointX: pointX,
      pointY: pointY
  });

}

// draw the dots down here

The algorithm itself is not the problem. The issue is that when the algorithm runs within an HTTP request, it takes a long time to calculate each point's color: about 16 seconds. But if I do it in Chrome on the front end, it takes just over a second (see the plunker). When I run the algorithm on the command line with Node, it takes less than a second. So the fact that my app runs the algorithm within an HTTP request seems to be slowing it down massively. A couple of questions:

Why would this be? Why does running an algorithm from within an HTTP request take so much longer?

What can I do to fix this, if anything? Would it somehow be possible to make a request to start the task, then notify the front end when it's finished and retrieve the PNG?

EDIT: I fully tested running the algorithm and creating a PNG through the command line. It's much quicker: less than half a second to work out what color each of the 800k data points should be. I'm thinking of using a socket to make a request to the server and start the task, then have it return the image. I'm baffled, though, as to why the code should take so long when run within an HTTP request...

EDIT: The problem is Mongo and Mongoose. I store the coordinates of each polygon in Mongo. I fetch these coordinates once, but when I compare them to each x, y point, this is somehow what massively delays the algorithm. If I close the Mongo document, the algorithm goes from 16 seconds to 1.5 seconds...

EDIT: @DevDig pointed out the main problem in the comments section: a Mongoose object carries lots of getters and setters, which slow down every property access. Using lean() in the query reduces the algorithm's run time from 16 seconds to 1.5 seconds.
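As a rough sketch of the lean() fix described in the last edit: the helper below assumes the polygons live in a Mongoose model, passed in here as `GateModel`. That name, the empty filter, and the helper name `loadGatePolygons` are assumptions for illustration, not the original app's code. With `lean()`, Mongoose returns plain JavaScript objects instead of full Mongoose documents, so the hot loop reads ordinary properties.

```javascript
// Hypothetical helper: fetch the polygons once, as plain objects.
// `GateModel` is assumed to be a Mongoose model holding the polygon
// coordinates; the empty filter `{}` is also an assumption.
function loadGatePolygons(GateModel) {
  return GateModel
    .find({})  // build the query
    .lean()    // ask for plain objects, not Mongoose documents
    .exec();   // run it; resolves to an array of plain objects
}
```

A caller might run `loadGatePolygons(Gate).then(function (gates) { ... })` once per request (or once at start-up if the polygons rarely change) and hand the resulting array to the per-point loop.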

Mark asked Jun 02 '17 12:06


1 Answer

I just finished running a version of your code as a Node.js service. The code is taken from your plunker. Execution time was 171 ms for 100,000 rows in data (I replicated the first 10K rows 10 times). Here's what I did:

First, your data.json and gates.json files aren't really JSON files; they are JavaScript files. I removed the var data/gates = statements from the front and removed the ending semicolon. The issue you're encountering may have to do with how you're reading in your data sets in your app. Since you don't modify gates or data, I read them in as part of the set-up on the server, which is exactly how you are processing them in the browser. If you need to read the files in each time you access the server, then that, of course, will change the timing. That change took the execution time from 171 ms to 515 ms, still nowhere near what you're seeing. This is being executed on a MacBook Pro. If needed, I can update the timings from a network-accessed cloud server.

Getting the files:

 var fs = require("fs");
 var path = require("path");
 var allGatesChain;
 var events = [];
 var x, y, pointX, pointY;

 // Read both data sets once at start-up, not on every request.
 var filename = path.join(__dirname, "data.txt");
 var data = JSON.parse(fs.readFileSync(filename, "utf-8"));

 filename = path.join(__dirname, "gates.json");
 var gates = JSON.parse(fs.readFileSync(filename, "utf-8"));

I moved your routines to create allGatesChain and events into the exported function:

  allGatesChain = getAllGatesChain();
  generateData();
  console.log("events is "+events.length+" elements long. events[0] is: "+events[0]);
  console.log("data is "+data.length+" elements long. data[0] is "+data[0]);

and then ran your code:

  var start, end;
  var plotColor = [];
  start = new Date().getTime();
  for (var i = 0; i < data.length; i++) {
    // get raw points
    x = data[i][0];
    y = data[i][1];
    // convert to a point on canvas
    pointX = getPointOnCanvas(x);
    pointY = getPointOnCanvas(y, 'y');
    // look up the color for event i against the gate chain
    var color = getColorOfCell({
      gateChain: allGatesChain,
      events: events,
      i: i
    });
    plotColor.push({
        color: color,
        pointX: pointX,
        pointY : pointY
    });
  }
  end = new Date().getTime();
  var _str = "loop execution took: "+(end-start)+" milliseconds.";
  console.log(_str);
  res.send(_str);

The result was 171 ms.

Bob Dill answered Oct 17 '22 00:10