I am trying to understand Threading
in NodeJS and how it works.
Currently what i understand:
Cluster: -
Child_process:
Make use also of different cores available, but its bad since it costs huge amount of resources to fork a child process since it creates virtual memory.
Forked processes could communicate with the master thread through events and vice versa, but there is no communication between forked processes.
Worker threads:
bufferArray
1) Why worker threads
is better than child process
and when we should use each of them?
2) What would happen if we have 4 cores and clustered/forked nodeJS webserver 4 times(1 process for each core), then we used worker threads
(There is no available cores) ?
People use the word "worker" when they mean a thread that does not own or interact with UI. Threads that do handle UI are called "UI" threads. Usually, your main (primary) thread will be the thread that owns and manages UI. And then you start one or more worker threads that do specific tasks.
Worker thread is a continuous parallel thread that runs and accepts messages until the time it is explicitly closed or terminated. Messages to a worker thread can be sent from the parent thread or its child worker threads. Through out this document, parent thread is referred as thread where a worker thread is spawned.
Worker threads provide us with a way to run multiple threads under a single process. Apart from keeping the Event Loop free of time consuming CPU operations, we can also use a pool of worker threads to divide and execute heavy CPU operations in parallel to increase the performance of our Node. js application.
Threads share memory (e.g. SharedArrayBuffer ) whereas processes don't. Essentially they are the same thing categorically. cluster. One process is launched on each CPU and can communicate via IPC. Each process has it's own memory with it's own Node (v8) instance.
You mentioned point under worker-threads that they are same in nature to child-process. But in reality they are not.
Process has its own memory space on other hand, threads use the shared memory space.
Thread is part of process. Process can start multiple threads. Which means that multiple threads started under process share the memory space allocated for that process.
I guess above point answers your 1st question why thread model is preferred over the process.
2nd point: Lets say processor can handle load of 4 threads at a time. But we have 16 threads. Then all of them will start sharing the CPU time.
Considering 4 core CPU, 4 processes with limited threads can utilize it in better way but when thread count is high, then all threads will start sharing the CPU time. (When I say all threads will start sharing CPU time I'm not considering the priority and niceness of the process, and not even considering the other processes running on the same machine.)
My Quick search about time-slicing and CPU load sharing:
This article even answers how switching between processes can slow down the overall performance.
Worker threads are are similar in nature to threads in any other programming language.
You can have a look at this thread to understand in overall about difference between thread and process: What is the difference between a process and a thread?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With