Can zip and unzip operations be made multithreaded in Node.js?
There are a bunch of modules like yauzl, but none of them uses multiple threads, and you can't start multiple threads yourself with node-cluster or something like that, because each zip file must be handled in a single thread.
For example, the hard drive can only load one file at a time, and can only write one zipped file to the archive at a time, so this aspect probably cannot be multithreaded. Nevertheless, it should be possible to load one file into memory while compressing another file in memory.
One might think we could extend the Node.js core to allow us to create and sync threads, but that isn't possible. If we add threads to JavaScript, then we are changing the nature of the language. We cannot just add threads as a new set of classes or functions; we'd need to change the language itself to support multithreading.
Node.js is single-threaded because the JavaScript programming language is single-threaded.
Node.js follows a single-threaded event-loop model, inspired by JavaScript's event-based model and callback mechanism. So Node.js is single-threaded in the same way JavaScript is, but not purely so: things like network calls, file system tasks, DNS lookups, etc. are done asynchronously, off the JavaScript thread.
According to the zlib documentation:
Threadpool Usage: All zlib APIs, except those that are explicitly synchronous, use libuv's threadpool. This can lead to surprising effects in some applications, such as subpar performance (which can be mitigated by adjusting the pool size) and/or unrecoverable and catastrophic memory fragmentation. https://nodejs.org/api/zlib.html#zlib_threadpool_usage
According to the libuv threadpool documentation, you can change the environment variable UV_THREADPOOL_SIZE to change the maximum number of threads in the pool.
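For example (a minimal sketch; the script name and input file names are assumptions), the asynchronous zlib stream APIs do their compression on libuv's threadpool, so several gzip pipelines started together can overlap, up to the pool size:

// Run with a larger threadpool, e.g.: UV_THREADPOOL_SIZE=8 node gzip-many.mjs
import { createReadStream, createWriteStream } from 'fs';
import { createGzip } from 'zlib';
import { pipeline } from 'stream/promises';

const files = ['a.txt', 'b.txt', 'c.txt']; // hypothetical inputs

// Each async gzip stream compresses on libuv's threadpool, so these
// pipelines run in parallel, bounded by UV_THREADPOOL_SIZE.
await Promise.all(
    files.map(f =>
        pipeline(createReadStream(f), createGzip(), createWriteStream(`${f}.gz`))
    )
);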
If you instead wish to compress many small files at the same time, you can use Worker Threads: https://nodejs.org/api/worker_threads.html
On reading your question again, it seems like you want to handle multiple files. Use Worker Threads; they will not block your main thread, and you can get the output back from them via promises.
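For example, here is a minimal single-file sketch (the input file names are assumptions) that spawns one worker per file, gzips each one off the main thread, and collects the results via promises:

import { Worker, isMainThread, workerData, parentPort } from 'worker_threads';
import { gzipSync } from 'zlib';
import { readFileSync, writeFileSync } from 'fs';

if (isMainThread) {
    const files = ['a.txt', 'b.txt']; // hypothetical inputs
    const outputs = await Promise.all(files.map(file =>
        new Promise((resolve, reject) => {
            // Re-run this same module as a worker, passing the file name via workerData.
            const worker = new Worker(new URL(import.meta.url), { workerData: file });
            worker.on('message', resolve); // resolves with the output path
            worker.on('error', reject);
        })
    ));
    console.log('compressed:', outputs);
} else {
    // The CPU-heavy compression runs here, off the main thread.
    const out = `${workerData}.gz`;
    writeFileSync(out, gzipSync(readFileSync(workerData)));
    parentPort.postMessage(out);
}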
Node.js uses libuv and worker threads. Worker threads are a way to run operations in a multi-threaded manner, while libuv maintains a thread pool whose size you can increase beyond the default. You can use both to improve Node.js performance for your operation.
Here is the official documentation for worker threads: https://nodejs.org/api/worker_threads.html
And see how you can increase the libuv thread pool size in Node.js here: print libuv threadpool size in node js 8
Here is how to do multi-threading in Node.js. You will have to create the three files below.
index.mjs
import run from './Worker.mjs';
/**
 * Build your input list of zip files here and send them to `run` one file name
 * at a time, using a loop or Promise.all. `run` returns a promise.
 * Example: run( <your_input> ).then( <your_output> );
 **/
Worker.mjs
import { Worker } from 'worker_threads';
function runService(id, options) {
    return new Promise((resolve, reject) => {
        const worker = new Worker('./WorkerService.mjs', { workerData: { <your_input> } });
        worker.on('message', res => resolve({ res: res, threadId: worker.threadId }));
        worker.on('error', reject);
        worker.on('exit', code => {
            if (code !== 0)
                reject(new Error(`Worker stopped with exit code ${code}`));
        });
    });
}

async function run(id, options) {
    return await runService(id, options);
}
export default run;
WorkerService.mjs
import { workerData, parentPort } from 'worker_threads';
// Here goes your logic for zipping a file; `workerData` will carry <your_input>,
// and `parentPort.postMessage(...)` reports the result back to Worker.mjs.
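Filled in, WorkerService.mjs could look something like this sketch, which assumes workerData carries a `file` field and uses zlib's streaming gzip as a stand-in for your zipping logic:

import { workerData, parentPort } from 'worker_threads';
import { createReadStream, createWriteStream } from 'fs';
import { createGzip } from 'zlib';
import { pipeline } from 'stream/promises';

// Assumes Worker.mjs passed { workerData: { file: '<your_input>' } }.
const { file } = workerData;
const output = `${file}.gz`;

// Stream the file through gzip so large inputs don't have to fit in memory.
await pipeline(createReadStream(file), createGzip(), createWriteStream(output));

// This message resolves the promise created in Worker.mjs.
parentPort.postMessage(output);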
Let me know if it helps.
Can zip and unzip operations be made multithreaded in Node.js?
Yes.
...and you can't start multiple threads yourself ... because each zip file must be handled in a single thread
I suspect your premise is faulty. Why exactly do you think a Node process cannot start multiple threads? Here is an app I'm running that uses the very mature Node.js cluster module, with a parent process acting as a supervisor and two child processes doing heavily network- and disk-I/O-bound tasks.
As you can see in the C column of the process listing (screenshot not reproduced here), each process is running on a separate thread. This lets the master process remain responsive for command-and-control tasks (like spawning/reaping workers) while the worker processes are CPU- or disk-bound. This particular server accepts files from the network, sometimes decompresses them, and feeds them through external file processors. In other words, it's a task that includes compression like you describe.
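A stripped-down sketch of that supervisor/worker layout (the worker count, file names, and the use of gzip in place of the real file processors are all assumptions):

import cluster from 'cluster';
import { gzipSync } from 'zlib';
import { readFileSync, writeFileSync } from 'fs';

if (cluster.isPrimary) { // cluster.isMaster on older Node versions
    // The parent only supervises: it forks workers and hands each one a file.
    const jobs = ['a.log', 'b.log']; // hypothetical inputs
    for (const file of jobs) {
        const worker = cluster.fork();
        worker.send({ file });
        worker.on('exit', code => console.log(`worker ${worker.id} exited with ${code}`));
    }
} else {
    // Each worker is a separate OS process, so CPU-bound compression here
    // never blocks the supervisor.
    process.on('message', ({ file }) => {
        writeFileSync(`${file}.gz`, gzipSync(readFileSync(file)));
        process.exit(0);
    });
}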
I'm not sure you'd want to use worker threads based on this snippet from the docs:
Workers (threads) are useful for performing CPU-intensive JavaScript operations. They will not help much with I/O-intensive work. Node.js’s built-in asynchronous I/O operations are more efficient than Workers can be.
To me, that description screams, "crypto!" In the past I've spawned child processes when having to perform any expensive crypto operations.
In another project I use Node's child_process module and kick off a new child process each time I have a batch of files to compress. That particular service sees a list of ~400 files with names like process-me-2019.11.DD.MM and concatenates them into a single process-me-2019-11-DD file. It takes a while to compress, so spawning a new process avoids blocking the main thread.
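The shape of that is roughly the following sketch (the `compress-batch.mjs` helper script, which would gzip the files passed on its argv, is hypothetical):

import { fork } from 'child_process';

function compressBatch(files) {
    return new Promise((resolve, reject) => {
        // Each batch gets its own Node process, so the main thread stays responsive.
        const child = fork('./compress-batch.mjs', files);
        child.on('error', reject);
        child.on('exit', code =>
            code === 0 ? resolve() : reject(new Error(`compressor exited with code ${code}`)));
    });
}

// usage:
// await compressBatch(listOfFileNames); // e.g. the ~400 process-me-* files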