
Why does this javascript block in Node.js?

I have the following simple http server using Node.js:

var http = require('http');

var server = http.createServer(function(req, res) {
    var counter = 0;

    for(var i = 1; i <= 30; i++) {
        http.get({ host: "www.google.com" }, function(r) {
            counter++;
            res.write("Response " + counter + ": " + r.statusCode + "\n");
            if(counter == 30) res.end();
        });
    }
});

server.listen(8000);

When I curl into my local host on port 8000, I do get the expected result of:

Response 1: 200
Response 2: 200
Response 3: 200
...
Response 30: 200

But when I try to curl in from another terminal while the first process is running, I see the console hang and wait for the first process to finish entirely before it starts receiving the same output.

My understanding was that, since this is async code using callbacks, Node could handle multiple requests concurrently by processing them on the next tick of the event loop. In fact, I even watched a video of Ryan Dahl doing something similar with a hello world example. What in my code is making the server block?

Swift asked Aug 17 '11 03:08



1 Answer

Your issue doesn't have anything to do with blocking calls. It comes from the fact that Node's HTTP client will only open a certain number of connections at a time to a single host. Once you hit the maximum number of open connections, the other asynchronous calls to http.get have to wait until the number of open connections falls again, which happens when earlier requests complete and their callbacks are fired. Since you're creating new requests faster than they drain, you get your seemingly blocking results.
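To make the queuing visible without depending on an external site, here is a minimal sketch that runs against a throwaway local server. The helper name measurePeakConcurrency, the 50 ms delay, and the request counts are all illustrative assumptions, not part of the original program. It sends requests through an http.Agent capped at maxSockets and records the highest number of requests the server sees in flight at once.

```javascript
var http = require('http');

// Hypothetical helper: fire `total` requests at a local server through an
// Agent capped at `maxSockets`, and resolve with the highest number of
// requests the server had in flight at the same time.
function measurePeakConcurrency(maxSockets, total) {
  return new Promise(function(resolve, reject) {
    var inFlight = 0, peak = 0, finished = 0;

    var server = http.createServer(function(req, res) {
      inFlight++;
      peak = Math.max(peak, inFlight);
      // Hold each response briefly so that requests have a chance to overlap.
      setTimeout(function() {
        inFlight--;
        res.end("ok");
      }, 50);
    });

    // A dedicated agent, so the global agent's settings are left untouched.
    var agent = new http.Agent({ maxSockets: maxSockets });

    server.listen(0, "127.0.0.1", function() {
      var port = server.address().port;
      for (var i = 0; i < total; i++) {
        http.get({ host: "127.0.0.1", port: port, agent: agent }, function(r) {
          r.resume(); // drain the body so the socket can be released
          r.on("end", function() {
            if (++finished === total) {
              agent.destroy();
              server.close();
              resolve(peak);
            }
          });
        }).on("error", reject);
      }
    });
  });
}

measurePeakConcurrency(2, 6).then(function(peak) {
  console.log("Peak concurrent requests with maxSockets=2: " + peak);
});
```

With the cap at 2, the reported peak should not exceed 2 even though all six http.get calls are issued immediately; the remaining requests sit in the agent's queue until a socket frees up, which is the same waiting you saw with 30 requests to one host.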

Here is a modified version of your program I created to test this. (Note that there is an easier way to solve your problem, as indicated by mtomis--more on this below.) I added some console.log logging, so it is easier to tell what order things were being processed in; I also reject all requests for anything other than /, so that favicon.ico requests are ignored. Finally, I make requests to a variety of different websites rather than to a single host.

var http = require('http');

// http://mostpopularwebsites.net/1-50/
var sites = [
  "www.google.com", "www.facebook.com", "www.youtube.com",
  "www.yahoo.com", "www.blogspot.com", "www.baidu.com", "www.live.com",
  "www.wikipedia.org", "www.twitter.com", "www.qq.com", "www.msn.com",
  "www.yahoo.co.jp", "www.sina.com.cn", "www.google.co.in", "www.taobao.com",
  "www.amazon.com", "www.linkedin.com", "www.google.com.hk",
  "www.wordpress.com", "www.google.de", "www.bing.com", "www.google.co.uk",
  "www.yandex.ru", "www.ebay.com", "www.google.co.jp", "www.microsoft.com",
  "www.google.fr", "www.163.com", "www.google.com.br",
  "www.googleusercontent.com", "www.flickr.com"
];

var server = http.createServer(function(req, res) {
  console.log("Got a connection.");
  if(req.url != "/") {
    console.log("But returning because the path was not '/'");
    res.end();
    return;
  }

  var counter = 0;

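  // Indices 1..30 are used; sites[0] is skipped.
  for(var i = 1; i <= 30; i++) {
    // bind() pins the current index and host as the callback's leading
    // arguments, so each callback reports the request it belongs to.
    http.get({ host: sites[i] }, function(index, host, r) {
      counter++;
      console.log("Response " + counter + " from # " + index + " (" + host + ")");
      res.write("Response " + counter + " from # " + index + " (" + host + ")\n");
      if(counter == 30) res.end();
    }.bind(this, i, sites[i]));
  }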
  console.log("Done with for loop.");
});

server.listen(8000);

I ran this program and very quickly visited the page in two different browsers (I also flushed my DNS cache, as the test was running too quickly to get good output otherwise). Here is the output:

Got a connection.
Done with for loop.
Response 1 from # 8 (www.twitter.com)
Response 2 from # 1 (www.facebook.com)
Response 3 from # 12 (www.sina.com.cn)
Response 4 from # 4 (www.blogspot.com)
Response 5 from # 13 (www.google.co.in)
Response 6 from # 19 (www.google.de)
Response 7 from # 26 (www.google.fr)
Response 8 from # 28 (www.google.com.br)
Response 9 from # 17 (www.google.com.hk)
Response 10 from # 6 (www.live.com)
Response 11 from # 20 (www.bing.com)
Response 12 from # 29 (www.googleusercontent.com)
Got a connection.
Done with for loop.
Response 13 from # 10 (www.msn.com)
Response 14 from # 2 (www.youtube.com)
Response 15 from # 18 (www.wordpress.com)
Response 16 from # 16 (www.linkedin.com)
Response 17 from # 7 (www.wikipedia.org)
Response 18 from # 3 (www.yahoo.com)
Response 19 from # 15 (www.amazon.com)
Response 1 from # 6 (www.live.com)
Response 2 from # 1 (www.facebook.com)
Response 3 from # 8 (www.twitter.com)
Response 4 from # 4 (www.blogspot.com)
Response 20 from # 11 (www.yahoo.co.jp)
Response 21 from # 9 (www.qq.com)
Response 5 from # 2 (www.youtube.com)
Response 6 from # 13 (www.google.co.in)
Response 7 from # 10 (www.msn.com)
Response 8 from # 24 (www.google.co.jp)
Response 9 from # 17 (www.google.com.hk)
Response 10 from # 18 (www.wordpress.com)
Response 11 from # 16 (www.linkedin.com)
Response 12 from # 3 (www.yahoo.com)
Response 13 from # 12 (www.sina.com.cn)
Response 14 from # 11 (www.yahoo.co.jp)
Response 15 from # 7 (www.wikipedia.org)
Response 16 from # 15 (www.amazon.com)
Response 17 from # 9 (www.qq.com)
Response 22 from # 5 (www.baidu.com)
Response 23 from # 27 (www.163.com)
Response 24 from # 14 (www.taobao.com)
Response 18 from # 5 (www.baidu.com)
Response 19 from # 14 (www.taobao.com)
Response 25 from # 24 (www.google.co.jp)
Response 26 from # 30 (www.flickr.com)
Response 20 from # 29 (www.googleusercontent.com)
Response 21 from # 22 (www.yandex.ru)
Response 27 from # 23 (www.ebay.com)
Response 22 from # 19 (www.google.de)
Response 23 from # 21 (www.google.co.uk)
Response 24 from # 28 (www.google.com.br)
Response 25 from # 25 (www.microsoft.com)
Response 26 from # 20 (www.bing.com)
Response 27 from # 30 (www.flickr.com)
Response 28 from # 22 (www.yandex.ru)
Response 28 from # 27 (www.163.com)
Response 29 from # 25 (www.microsoft.com)
Response 29 from # 26 (www.google.fr)
Response 30 from # 21 (www.google.co.uk)
Response 30 from # 23 (www.ebay.com)
Got a connection.
But returning because the path was not '/'

As you can see, other than the period of time it took me to hit Alt+Tab Enter, the callbacks are completely intermingled--asynchronous, non-blocking I/O at its finest.

[Edit]

As mtomis mentioned, the maximum number of connections you can have open per host is configurable via the global http.globalAgent.maxSockets. Set this to the number of concurrent connections you want to allow per host, and the issue you observed disappears.
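A minimal sketch of that setting follows. The value 30 is only chosen to match the 30 requests in the question, and the dedicated agent is an optional alternative I am adding for illustration. As far as I know, Node releases from that era defaulted maxSockets to 5, while later releases default it to Infinity, so check the documentation for the version you run.

```javascript
var http = require('http');

// Raise the per-host connection cap on the shared global agent.
// 30 is an assumed value matching the 30 requests in the question.
http.globalAgent.maxSockets = 30;

// Alternatively, scope the cap to specific requests with a dedicated agent
// and pass it in the request options:
var agent = new http.Agent({ maxSockets: 30 });
// http.get({ host: "www.google.com", agent: agent }, function(r) { /* ... */ });
```

The global setting affects every request that uses the default agent, so the dedicated agent is the safer choice when only one code path needs the higher limit.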

Michelle Tilley answered Oct 02 '22 12:10