How do the different compression levels of gzip differ?

Question

I am trying to better understand how different compression levels (1-9) of gzip differ in the way that encoding is implemented.

I've looked at the zlib C source code, and it seems to come down to how exhaustive the search for the longest matching string is, but I'm looking for more specific information.

For example, do the levels yield any differences in the assignment of Huffman codes?

Mark Adler · Accepted Answer

The levels differ only in how hard deflate looks for matching strings, as you observed. The Huffman coding is done on a chosen fixed number of symbols (literals and length/distance pairs), producing a "block", where that number is defined by the memory level, not the compression level. The Huffman codes generated will necessarily differ, since the symbols being coded will differ.
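To see the effect of the match search in practice, here is a minimal sketch using Python's standard `zlib` module. It compresses one input at each level from 1 to 9 and prints the resulting sizes. The helper names (`make_sample`, `sizes_by_level`) and the sample vocabulary are made up for illustration, and the actual sizes depend on the input and the zlib version.

```python
import random
import zlib


def make_sample(n_words=20000, seed=0):
    """Build deterministic, repetitive text so deflate has matches to find."""
    rng = random.Random(seed)
    vocab = [b"deflate", b"huffman", b"literal", b"length",
             b"distance", b"window", b"block", b"match"]
    return b" ".join(rng.choice(vocab) for _ in range(n_words))


def sizes_by_level(data, levels=range(1, 10)):
    """Return {level: compressed size in bytes} for each compression level."""
    return {level: len(zlib.compress(data, level)) for level in levels}


sample = make_sample()
sizes = sizes_by_level(sample)
print(f"input: {len(sample)} bytes")
for level, size in sizes.items():
    print(f"level {level}: {size} bytes")
```

Every level produces a valid deflate stream that decompresses to the same bytes. Only the effort spent finding matches, and therefore the symbols handed to the Huffman coder, changes.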

The choice of memory level also has some effect on compression: a larger number of symbols spreads the cost of the code description for a block over more symbols, but too many symbols may prevent the Huffman codes from adapting to local changes in the symbol statistics. The default memory level is 8 (resulting in 16,383 symbols per block), since testing indicated that it gave better compression than memory level 9 (32,767 symbols per block). However, your mileage may vary.
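The memory level can be varied independently of the compression level through `zlib.compressobj` in Python. The sketch below is illustrative only: `symbols_per_block` assumes the block size is 2^(memLevel + 6) - 1, a formula chosen because it reproduces the two figures quoted above, and the helper names and sample data are hypothetical. Note that the memory level also sizes zlib's internal hash table, so any size difference is not due to block length alone.

```python
import zlib


def symbols_per_block(mem_level):
    """Assumed relation: 2**(memLevel + 6) - 1 symbols per block."""
    return (1 << (mem_level + 6)) - 1


def compress_with_mem_level(data, mem_level, level=6):
    """Compress with a fixed compression level and a chosen memory level."""
    comp = zlib.compressobj(level, zlib.DEFLATED, zlib.MAX_WBITS, mem_level)
    return comp.compress(data) + comp.flush()


# Deterministic sample data, made up for this example.
data = b"".join(b"record %d: value %d\n" % (i, i * i % 97)
                for i in range(50000))

for mem_level in (1, zlib.DEF_MEM_LEVEL, 9):
    out = compress_with_mem_level(data, mem_level)
    print(f"memLevel {mem_level}: {symbols_per_block(mem_level)} "
          f"symbols/block, {len(out)} bytes")
```

Whether memory level 8 or 9 wins depends on the data, so it is worth measuring on a representative input rather than assuming.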

Tags:

compression

zip

gzip

zlib

glupyan
