
What is the fastest bzip2 decompressor?

Which implementation of bzip2 has the highest decompression speed?

There is micro-bunzip.c (http://bitbucket.org/james_taylor/seek-bzip2/src/tip/micro-bunzip.c), whose header comment claims:

Size and speed optimizations by Manuel Novoa III ([email protected]). More efficient reading of huffman codes, a streamlined read_bunzip() function, and various other tweaks. In (limited) tests, approximately 20% faster than bzcat on x86 and about 10% faster on arm. Note that about 2/3 of the time is spent in read_unzip() reversing the Burrows-Wheeler transformation. Much of that time is delay resulting from cache misses.

Many of those cache misses could probably be optimized away, so even faster implementations should be possible.
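To make the source of those cache misses concrete, here is a minimal Python sketch of the inverse Burrows-Wheeler step. It is not the code from micro-bunzip.c; the names `bwt`, `inverse_bwt`, `tt` and `primary` are illustrative, and the naive forward transform exists only so the example can be checked. The point to notice is the final loop: every iteration jumps to an index that depends on the previous one, which on a full bzip2 block (up to 900 kB of data) means dependent, effectively random reads over a multi-megabyte table.

```python
def bwt(data: bytes) -> tuple[bytes, int]:
    """Naive forward BWT (sorts all rotations); for demonstration only."""
    n = len(data)
    rotations = sorted(range(n), key=lambda i: data[i:] + data[:i])
    last = bytes(data[(i - 1) % n] for i in rotations)
    # Row of the sorted matrix that holds the original string.
    return last, rotations.index(0)


def inverse_bwt(last: bytes, primary: int) -> bytes:
    """Invert the BWT with a counting sort and a pointer-chasing loop."""
    n = len(last)

    # Count byte frequencies, then compute where each byte value
    # starts in the (implicit) sorted first column.
    counts = [0] * 256
    for b in last:
        counts[b] += 1
    starts = [0] * 256
    total = 0
    for c in range(256):
        starts[c] = total
        total += counts[c]

    # tt[j] = position in `last` of the j-th byte of the sorted column.
    tt = [0] * n
    for i, b in enumerate(last):
        tt[starts[b]] = i
        starts[b] += 1

    # Pointer chasing: each read of tt[pos] depends on the previous one,
    # so the hardware cannot easily prefetch the next access.
    out = bytearray()
    pos = tt[primary]
    for _ in range(n):
        out.append(last[pos])
        pos = tt[pos]
    return bytes(out)


if __name__ == "__main__":
    last, primary = bwt(b"banana")
    print(last, primary, inverse_bwt(last, primary))
```

Any speedup from reducing cache misses would have to come from restructuring that last loop or the layout of `tt`; the sketch only shows where the time goes, not how to fix it.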

This one (seek-bzip2) also has an interesting feature: easy seeking in the input file.

My program will consume the output of bzip2 and can (theoretically) do so in parallel on different parts of the file, so parallel bzip2 implementations are of interest too.
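As a rough illustration of processing different parts in parallel, the sketch below uses only Python's standard `bz2` module. It assumes the parts are independently compressed bzip2 streams with known boundaries (the layout a multi-stream file has); a regular single-stream .bz2 file would first need its block boundaries located at the bit level, which this sketch does not attempt. The helper names `compress_in_parts` and `decompress_parts_parallel` are made up for the example, and the thread pool only helps to the extent that the interpreter releases its global lock during decompression, which CPython's `bz2` is expected to do.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor


def compress_in_parts(data: bytes, part_size: int) -> list[bytes]:
    """Compress each fixed-size part as its own independent bzip2 stream."""
    return [
        bz2.compress(data[i:i + part_size])
        for i in range(0, len(data), part_size)
    ]


def decompress_parts_parallel(parts: list[bytes], workers: int = 4) -> list[bytes]:
    """Decompress independent streams concurrently, preserving their order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so parts stay aligned.
        return list(pool.map(bz2.decompress, parts))


if __name__ == "__main__":
    original = b"some repetitive payload " * 50_000
    parts = compress_in_parts(original, part_size=200_000)
    restored = b"".join(decompress_parts_parallel(parts))
    print(restored == original)
```

Because each part is a complete stream, a consumer can start working on part N without waiting for part N-1, which is the property the question is after.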

Thanks.

osgx asked Sep 13 '10 13:09


2 Answers

There is a short comparison at http://lists.debian.org/debian-mentors/2009/02/msg00135.html, which also covers parallel versions.

There is a bit more at http://realworldtech.com/forums/index.cfm?action=detail&id=98883&threadid=98430&roomid=2

Both links come from the article on Intel's Cilk-parallel version of bzip2: http://software.intel.com/en-us/articles/a-parallel-bzip2/

Also, Intel's IPP-powered bzip2 is rather good. Inside IPP it also tries (with a negative effect) to parallelize some internals of bzip2 with OpenMP (Intel KMP 5), though it does not decompress blocks in parallel. When limited to one or two threads, about 20 MByte/s of decompressed output is achievable on a 2.4 GHz Core 2 (IPP "v8" code).
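To put a figure like 20 MByte/s in context on your own machine, a small timing harness is enough. The sketch below measures decompressed-output throughput for Python's built-in `bz2` module, which is only one baseline and says nothing about the IPP or Cilk builds; `decompression_throughput` is a hypothetical helper name, and the best-of-N timing is just one reasonable way to reduce noise.

```python
import bz2
import time


def decompression_throughput(compressed: bytes, repeats: int = 3) -> float:
    """Return the best-of-N throughput in MB of decompressed output per second."""
    best = float("inf")
    size = 0
    for _ in range(repeats):
        start = time.perf_counter()
        size = len(bz2.decompress(compressed))
        best = min(best, time.perf_counter() - start)
    # Guard against a zero elapsed time on very small inputs.
    return size / max(best, 1e-9) / 1e6


if __name__ == "__main__":
    # Synthetic, mildly compressible test data; real files will behave differently.
    payload = bz2.compress(b"".join(str(i).encode() for i in range(500_000)))
    print(f"{decompression_throughput(payload):.1f} MB/s")
```

Results depend heavily on the input data, so comparisons between implementations are only meaningful on the same file.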

Hope this helps.

osgx answered Nov 15 '22 03:11


lbzip2 is a good alternative.

sudo apt install lbzip2

lbzip2 -d <archive>
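
If the decompressed data is meant to be consumed by another program rather than written to disk, one option is to read it from a pipe. The following Python sketch runs `lbzip2 -d -c` (decompress to standard output) when the tool is on the PATH and falls back to the standard `bz2` module otherwise; `stream_decompressed` and the 1 MiB chunk size are illustrative choices, not part of lbzip2.

```python
import bz2
import shutil
import subprocess
from typing import Iterator


def stream_decompressed(path: str, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield decompressed chunks of a .bz2 file, preferring lbzip2 if installed."""
    tool = shutil.which("lbzip2")
    if tool is None:
        # Fallback: single-threaded decompression with the standard library.
        with bz2.open(path, "rb") as f:
            while chunk := f.read(chunk_size):
                yield chunk
        return

    # -d: decompress, -c: write the result to standard output.
    with subprocess.Popen([tool, "-d", "-c", path], stdout=subprocess.PIPE) as proc:
        while chunk := proc.stdout.read(chunk_size):
            yield chunk
    if proc.returncode != 0:
        raise RuntimeError(f"lbzip2 exited with status {proc.returncode}")


if __name__ == "__main__":
    import sys

    total = sum(len(chunk) for chunk in stream_decompressed(sys.argv[1]))
    print(f"{total} bytes decompressed")
```

Reading in chunks keeps memory use bounded, so the consumer can process the stream while decompression is still running.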

Flaviu answered Nov 15 '22 04:11