
When is Node.js blocking?

I have used Node.js for a while now and I just realized it can be blocking. I just cannot wrap my brain around the conditions under which Node.js becomes blocking.

  • So, Node.js is single-threaded because (i) JavaScript is and (ii) that avoids all the multi-threaded pitfalls.
  • To do a lot of things at once, despite being single-threaded, it implements asynchronous execution. So, talking with the DB (the I/O in general) is non-blocking (because it is asynchronous).
  • But all the incoming requests to do some work (i.e. talk with the DB) and all the results of that work that must go back to the client (i.e. send some data) use that single thread.
  • Node.js uses the "event loop" inside that single thread to get all the requests and assign them to non-blocking I/O tasks.

So the I/O tasks are non-blocking because of asynchronous callbacks, but the single thread can be blocking because it's synchronous, and the event loop can get jammed when a lot of complicated requests show up at the same time?

  1. Am I right, did I understand this correctly? I guess I don't, because here and here they emphasize that "Node is single-threaded which means none of your code runs in parallel". What does that actually mean and how does it make Node blocking?
  2. So, the event loop runs forever and always searches for requests, or it starts execution after it spots a new request?
  3. Does the Node blocking weakness render Node useless for big projects and make it suitable only for micro-sites and small projects?

Thanks a lot.

slevin asked Apr 16 '16 12:04



1 Answer

First, to be clear: node.js as a whole isn't single-threaded. Node does have a thread pool via libuv that it uses to perform tasks that are either currently impossible to do efficiently from a single thread on most platforms (e.g. file I/O) or inherently computation intensive (e.g. zlib). It should be noted that most of the crypto module (which would also be inherently computation intensive) currently does not have an async/non-blocking interface (exceptions include crypto.randomBytes() and crypto.pbkdf2()).
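As a minimal sketch of what "handed off to the thread pool" looks like from JavaScript, the snippet below calls the callback form of crypto.randomBytes() and records the order in which things happen. The helper name `randomHexAsync` and the `events` array are made up for illustration; only `crypto.randomBytes` is a real Node API here.

```javascript
const crypto = require('crypto');

// Hypothetical helper: resolves with `size` random bytes as a hex string.
// The callback form of crypto.randomBytes() does its work off the main thread.
function randomHexAsync(size) {
  return new Promise((resolve, reject) => {
    crypto.randomBytes(size, (err, buf) => {
      if (err) reject(err);
      else resolve(buf.toString('hex'));
    });
  });
}

const events = [];

const pending = randomHexAsync(16).then((hex) => {
  events.push('callback ran');
  return hex;
});

// This runs before the callback, because the call above returned immediately.
events.push('main thread kept going');

pending.then((hex) => {
  console.log(events.join(', then '), '|', hex.length, 'hex chars');
});
```

The main thread only pays for scheduling the request and for running the callback; the byte generation itself happens elsewhere.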

V8 also uses multiple threads for things like garbage collection and function optimization.

However, just about everything else in node does occur within the same, single thread.
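To make the consequence of that single thread concrete, here is a small sketch (the `busyWait` helper is hypothetical, written only for this demonstration). A timer is requested with a 0 ms delay, but its callback cannot run until the synchronous loop on the main thread has finished.

```javascript
// Hypothetical helper: spins the CPU synchronously for roughly `ms` milliseconds.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Burn CPU; nothing else in this process's JavaScript can run meanwhile.
  }
}

const scheduledAt = Date.now();
let timerDelay = null;

// Ask for the callback "as soon as possible".
setTimeout(() => {
  timerDelay = Date.now() - scheduledAt;
  console.log(`timer fired after ~${timerDelay} ms`);
}, 0);

// The timer callback is stuck behind this synchronous call.
busyWait(200);
```

The reported delay is at least 200 ms rather than close to 0. In a server, every pending request would wait in the same way behind one long synchronous computation.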

Now to address your questions specifically:

  1. The fact that the JavaScript code is run from a single thread doesn't make node block. As this answer explains, node is foremost about (I/O) concurrency rather than (code) parallelism. You could run node code in parallel, for example by using the built-in cluster module on a multi-core/multi-CPU system, but node's primary goal is to be able to handle a lot of I/O concurrently without dedicating one thread per socket/server/etc.

  2. There is a good, detailed writeup here that describes how the event loop in node works.

  3. Node's primary goal, as previously described, is to handle I/O really well, which fits the majority of use cases for web applications and network programs in general.

    If your script is CPU-bound (e.g. you're calculating pi or transcoding audio/video), you are probably better off delegating that work to a child process in node (e.g. calling out to ffmpeg for transcoding instead of doing it in JavaScript or synchronously in a C++ node addon on node's main thread). You could do these blocking things in-process if you aren't doing anything else at the same time (like handling HTTP requests). Many people use node this way for utility tasks where I/O concurrency isn't as important. One example might be a script that performs minification, linting, and/or bundling of JS and CSS files, or a script that creates thumbnails from a large set of images.

    However, if your script instead creates a TCP or HTTP server, for example, that pulls information from a database, formats it, and sends it back to the user, then node will be good at that, because the majority of the time spent in the process is just waiting for sockets/HTTP clients to send (more) data and waiting for the database to reply with query results.
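The delegation idea from point 3 can be sketched roughly as follows. This is an illustration under assumptions, not a recommended production setup: the CPU-bound "job" is a toy summation passed to a second node process with `-e`, and `sumBelowInChild`, `ticks`, and `ticker` are hypothetical names. Calling out to a tool such as ffmpeg would follow the same shape, with a different executable and arguments.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: runs a CPU-bound loop in a separate node process
// and resolves with the number that process prints to stdout.
function sumBelowInChild(n) {
  const script =
    `let s = 0; for (let i = 0; i < ${Number(n)}; i++) s += i; ` +
    'process.stdout.write(String(s));';
  return new Promise((resolve, reject) => {
    execFile(process.execPath, ['-e', script], (err, stdout) => {
      if (err) reject(err);
      else resolve(Number(stdout));
    });
  });
}

// Stand-in for "other work" (e.g. serving requests) in the parent process.
let ticks = 0;
const ticker = setInterval(() => { ticks += 1; }, 10);

const result = sumBelowInChild(1e6).then((sum) => {
  clearInterval(ticker);
  console.log(`sum = ${sum}, parent ticked ${ticks} times while waiting`);
  return sum;
});
```

The parent's event loop stays free while the child computes, so the interval timer keeps firing. The trade-off is the cost of starting a process, which is why this suits heavier jobs better than tiny ones.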

mscdex answered Oct 09 '22 13:10
