Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Big O complexities of algorithms - LZW and Huffman

What are the space and time complexities, in Big O notation, for the Lempel-Ziv-Welch and Huffman compression algorithms? Google is failing me.

Thanks,

Francisco

like image 245
F. P. Avatar asked May 31 '11 15:05

F. P.


People also ask

Which is better Huffman or LZW?

Huffman coding is well-situated than LZW coding. LZW coding facilitates more compression ratio than Huffman algorithm. Huffman coding requires more execution time than the LZW. In some cases time is not important as Huffman coding can be used to obtain high compression ratio.

What is LZW algorithm?

Lempel–Ziv–Welch (LZW) is a universal lossless data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch. It was published by Welch in 1984 as an improved implementation of the LZ78 algorithm published by Lempel and Ziv in 1978.

Is LZW lossless or lossy?

LZW TIFFs are considered a lossless file format. LZW TIFF compressions reorder the digital data into a smaller file size without deleting any pixels at all.

How does LZW achieve data compression?

LZW compression works by reading a sequence of symbols, grouping the symbols into strings, and converting the strings into codes. Because the codes take up less space than the strings they replace, we get compression.


2 Answers

As the dictionary size is fixed and independent of the input length, LZW is in O(n) as each byte is only read once and the complexity of the operation for each character is constant.

And Huffman encoding is also in O(n): First you count the number of occurrences for each input byte, then you sort it and build the output encoding.

like image 80
Gumbo Avatar answered Sep 21 '22 12:09

Gumbo


Depends on the implementation. They get better all the time. "Huffman" is a bit too common a term. For example, you could mean an explicit tree, an implicit, a dynamic one... But in any case, I guess if you do it very clever you should be able to implement alomost any "Huffman" on O(n), with n being the text-length.

LZW is also implementation-dependent. I do not know off-hand what "O" common implementations have. I guess with big tables you probably have something like O(n log n), but thats just a guess.

like image 41
towi Avatar answered Sep 19 '22 12:09

towi