Is there a way to spread xz compression efforts across multiple CPUs? I realize that this doesn't appear possible with xz itself, but are there other utilities that implement the same compression algorithm and allow more efficient processor utilization? I will be running this in scripts and utility apps on systems with 16+ processors, and it would be useful to use at least 4-8 processors to speed up compression.
Multiprocessor (multithreading) compression support was added to xz in version 5.2, in December 2014.
To enable the functionality, add the -T option, along with either the number of worker threads to spawn, or -T0 to spawn as many threads as the OS reports CPUs:
xz -T0 big.tar
xz -T4 bigish.tar
The default single-threaded operation is equivalent to -T1.
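A file compressed with several threads is still an ordinary .xz file, so it can be tested and decompressed like any other. The following is a minimal sketch of that round trip, not a benchmark; the function name roundtrip_check, the scratch file name, and its 8 MiB size are made up for illustration.

```shell
#!/bin/sh
# Sketch: compress a scratch file with N worker threads and confirm it
# round-trips. roundtrip_check and sample.bin are hypothetical names.

roundtrip_check() {
    threads=$1
    workdir=$(mktemp -d) || return 1

    # 8 MiB of zeros: enough to exercise the commands, not to measure speed
    dd if=/dev/zero of="$workdir/sample.bin" bs=1024 count=8192 2>/dev/null

    # -k keeps the input file, -T sets the number of worker threads,
    # -t tests integrity, -dc decompresses to stdout for comparison
    xz -k -T "$threads" "$workdir/sample.bin" &&
        xz -t "$workdir/sample.bin.xz" &&
        xz -dc "$workdir/sample.bin.xz" | cmp -s - "$workdir/sample.bin"
    status=$?

    rm -rf "$workdir"
    return "$status"
}

# Only run the demo when xz is actually installed
if command -v xz >/dev/null 2>&1; then
    roundtrip_check 4 && echo "round trip OK with -T4"
fi
```

In a real script you would replace the scratch file with your own input and drop the comparison step once you trust the setup.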
I have found that running it with a couple of hyper-threads fewer than the total number of hyper-threads on my CPU† provides a good balance of responsiveness and compression speed.
† So -T10 on my 6-core, 12-thread workstation.
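For scripts that run on machines with different core counts, the "leave a couple of threads free" rule can be computed rather than hard-coded. This is only a sketch: pick_threads is a hypothetical helper, the reserve of 2 mirrors the workstation example above, and nproc is assumed to be available (it ships with GNU coreutils), with getconf as a fallback.

```shell
#!/bin/sh
# Sketch: derive a -T value that leaves some hardware threads idle.
# pick_threads is a hypothetical helper, not part of xz.

pick_threads() {
    total=$1
    reserve=${2:-2}          # threads to leave free; 2 is an arbitrary default
    n=$((total - reserve))
    if [ "$n" -lt 1 ]; then  # never go below one worker thread
        n=1
    fi
    echo "$n"
}

# Number of online logical CPUs as reported by the OS
total=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
threads=$(pick_threads "$total" 2)

# Print the command instead of running it, since big.tar is a placeholder
echo "would run: xz -T$threads big.tar"
```

On a 12-thread machine this prints the same -T10 used above; on a single-core machine it falls back to -T1.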
As scai and Dzenly said in the comments, if you want to use this in combination with tar, just run export XZ_DEFAULTS="-T 0" beforehand, or use something like:
XZ_OPT="-2 -T0"
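Put together with tar, that might look like the sketch below. The function names tar_xz_parallel and tar_xz_pipe and their arguments are hypothetical; the first relies on tar's -J option starting xz (which reads XZ_OPT from the environment), and the second avoids the environment variable by piping tar's output into xz directly.

```shell
#!/bin/sh
# Sketch: two ways to produce a multithreaded .tar.xz.
# tar_xz_parallel and tar_xz_pipe are hypothetical helper names.

# Variant 1: let tar start xz via -J and pass options through XZ_OPT.
# The assignment prefix sets XZ_OPT for this one tar invocation only.
tar_xz_parallel() {
    src_dir=$1
    out=$2
    XZ_OPT="-2 -T0" tar -cJf "$out" \
        -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}

# Variant 2: write the tar stream to stdout and compress it with xz -T0.
tar_xz_pipe() {
    src_dir=$1
    out=$2
    tar -cf - -C "$(dirname "$src_dir")" "$(basename "$src_dir")" |
        xz -T0 > "$out"
}

# Example calls (paths are placeholders):
#   tar_xz_parallel /path/to/data data.tar.xz
#   tar_xz_pipe     /path/to/data data.tar.xz
```

The pipe variant makes the thread count visible at the call site, while the XZ_OPT variant keeps an existing tar command line unchanged.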