
Parallelizing tasks in Node.js

I have some tasks I want to do in JS that are resource intensive. For this question, let's assume they are heavy calculations rather than system access. I want to run tasks A, B and C at the same time and execute some function D when they are done.

The async library provides a nice scaffolding for this:

async.parallel([A, B, C], D); 

If what I am doing is just calculations, then this will still run synchronously (unless the library is putting the tasks on different threads itself, which I expect is not the case). How do I make this be actually parallel? What is the thing done typically by async code to not block the caller (when working with NodeJS)? Is it starting a child process?

Jeroen De Dauw asked Oct 01 '13 15:10




1 Answer

2022 notice: this answer predates the introduction of worker threads in Node.js

How do I make this be actually parallel?

First, you won't really be running in parallel within a single node application. A node application runs its JavaScript on a single thread, and node's event loop processes only one event at a time. Even on a multi-core box, you won't get parallel processing within one node application.
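A minimal sketch of this behaviour, assuming a hand-written `parallel` helper as a stand-in for async.parallel (it is not the library's implementation, and error handling is left out) and a hypothetical `makeTask` that does synchronous CPU work. The log shows each task finishing before the next one starts:

```javascript
// Stand-in for a parallel-style helper: starts every task, then calls
// done once all of them have reported a result. Not the async library.
function parallel(tasks, done) {
  const results = [];
  let pending = tasks.length;
  tasks.forEach((task, index) => {
    task((err, value) => {
      results[index] = value;
      pending -= 1;
      if (pending === 0) done(null, results);
    });
  });
}

const log = [];

// Hypothetical CPU-bound task: a tight synchronous loop.
function makeTask(name) {
  return (callback) => {
    log.push(name + ' start');
    let sum = 0;
    for (let i = 0; i < 1e6; i++) sum += i;
    log.push(name + ' end');
    callback(null, sum);
  };
}

parallel([makeTask('A'), makeTask('B'), makeTask('C')], () => {
  log.push('D');
});

console.log(log.join(', '));
// A start, A end, B start, B end, C start, C end, D
```

Because each task blocks the only thread until it returns, B cannot start before A has ended, no matter how the tasks are scheduled.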

That said, you can get parallel processing on a multi-core machine by forking the code into separate node processes or by spawning child processes. This, in effect, lets you create multiple instances of node itself and communicate with those processes in different ways (e.g. stdout, or the IPC mechanism that process fork provides). Additionally, you could separate the functions (by responsibility) into their own node apps/servers and call them via RPC.
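As a rough sketch of the child-process approach, the following runs each task in its own node process and collects the results from stdout. The names `sumScript`, `runInChild` and `runAll` are hypothetical, and the task (summing 1..n) is only a placeholder for a real calculation:

```javascript
const { spawn } = require('child_process');

// Hypothetical CPU-bound task: sum the integers 1..n in a tight loop.
// It is built as source text so it can run in a separate node process.
function sumScript(n) {
  return `let s = 0; for (let i = 1; i <= ${n}; i++) s += i; console.log(s);`;
}

// Run one script in its own node process and resolve with its stdout.
function runInChild(script) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', script]);
    let out = '';
    child.stdout.on('data', (chunk) => { out += chunk; });
    child.on('error', reject);
    child.on('close', (code) => {
      if (code === 0) resolve(out.trim());
      else reject(new Error('child exited with code ' + code));
    });
  });
}

// Each input gets its own process; the promise settles once all finish.
function runAll(inputs) {
  return Promise.all(inputs.map((n) => runInChild(sumScript(n))))
    .then((outputs) => outputs.map(Number));
}

// Usage: the callback plays the role of D from the question.
runAll([10, 100, 1000]).then((results) => {
  console.log(results); // [ 55, 5050, 500500 ]
});
```

Starting a process has a noticeable cost, so this only pays off when each task is heavy enough to outweigh the startup time.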

What is the thing done typically by async code to not block the caller (when working with NodeJS)? Is it starting a child process?

It is not starting a new process. Underneath, when async.parallel is used in node.js, it is using process.nextTick(). nextTick() lets you avoid blocking the caller by deferring work until the current call stack has unwound, so you can break up CPU-intensive tasks and interleave them.
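To illustrate interleaving, here is a sketch that splits a hypothetical calculation (`sumInChunks`, not part of the async library) into slices. It uses setImmediate() rather than process.nextTick(), because nextTick callbacks run before the event loop moves on to pending I/O, so a long chain of them can still starve other events. This keeps the process responsive, but it does not make the calculation run in parallel:

```javascript
// Sum the integers 1..n in slices of chunkSize, yielding to the event
// loop between slices so that other events can be processed.
function sumInChunks(n, chunkSize, done) {
  let total = 0;
  let i = 1;

  function step() {
    const end = Math.min(i + chunkSize - 1, n);
    for (; i <= end; i++) total += i;

    if (i <= n) {
      setImmediate(step); // schedule the next slice, let other events run
    } else {
      done(null, total);
    }
  }

  // Defer the first slice as well, so done is never called synchronously.
  setImmediate(step);
}

// Usage: the timer can fire between slices instead of waiting for the
// whole calculation to finish.
setTimeout(() => console.log('timer fired'), 0);
sumInChunks(5e6, 1e5, (err, total) => {
  console.log('sum:', total);
});
```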

Long story short

Node doesn't make it easy "out of the box" to achieve multiprocessor concurrency. Instead, node gives you a non-blocking design and an event loop that runs on a single thread without shared memory. Because no threads share data/memory, locks aren't needed: node is lock free. One node process uses one thread, and this makes node both safe and powerful.

When you need to split work up among multiple processes, use some sort of message passing (e.g. IPC/RPC) to communicate with the other processes/servers.
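A minimal sketch of message passing over the IPC channel that child_process.fork() sets up. The worker source is written to a temporary file only so that the example fits in one file; in a real project it would be its own module. The names `workerSource` and `sumInWorker` and the message shape `{ n, sum }` are assumptions made for illustration:

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Worker: waits for a message, does the calculation, sends the result back.
const workerSource = `
  process.on('message', (msg) => {
    let sum = 0;
    for (let i = 1; i <= msg.n; i++) sum += i;
    process.send({ n: msg.n, sum: sum });
  });
`;

// Fork a worker process, send it one job, and resolve with its reply.
function sumInWorker(n) {
  return new Promise((resolve, reject) => {
    const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'ipc-sketch-'));
    const file = path.join(dir, 'worker.js');
    fs.writeFileSync(file, workerSource);

    const child = fork(file);
    child.once('message', (reply) => {
      child.kill(); // the worker has no more work to do
      fs.rmSync(dir, { recursive: true, force: true });
      resolve(reply);
    });
    child.once('error', reject);
    child.send({ n: n });
  });
}

// Usage: three workers run as separate processes and reply independently.
Promise.all([sumInWorker(10), sumInWorker(100), sumInWorker(1000)])
  .then((replies) => console.log(replies.map((r) => r.sum)));
```

Messages are serialized when they cross the process boundary, so the processes exchange copies of the data rather than sharing memory.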


For more see:

Awesome answer from SO on What is Node.js... with tons of goodness.

Understanding process.nextTick()

Matt Self answered Sep 22 '22 14:09