
Using a preset deflate dictionary to reduce compressed archive file size

I have a requirement where text files are sent from one location to another. Both locations are under our control. The content and the words that can appear in these files are mostly the same, which means that if I keep the deflate dictionary at both locations, there is no need to send it with each file.

I have been reading about this for the past week and experimenting with some available code, such as this and this.

However, I am still in the dark.

A few questions I still have:

  1. Can we generate and use a custom deflate dictionary from a preset list of words?
  2. Can we send a file without the deflate dictionary and use a local one?
  3. If not gzip, is there any other compression library that can be used for this purpose?

Some references I stumbled upon so far:

  1. https://medium.com/iecse-hashtag/huffman-coding-compression-basics-in-python-6653cdb4c476
  2. https://blog.cloudflare.com/improving-compression-with-preset-deflate-dictionary/
  3. https://www.euccas.me/zlib/#zlib_optimize_cloudflare_dict
esafwan asked Oct 19 '25 13:10



1 Answer

The zlib library supports dictionaries with the zlib (not gzip) format. See deflateSetDictionary() and inflateSetDictionary().
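As a rough illustration of the same idea from a higher-level language: Python's zlib module exposes the preset dictionary through the zdict argument of zlib.compressobj() and zlib.decompressobj(). The sketch below assumes both locations hold an identical copy of the dictionary; the dictionary contents, the sample message, and the helper names are made up for the example.

```python
import zlib

# Hypothetical shared dictionary, stored once at both locations.
# It is never transmitted with the compressed data.
PRESET_DICT = b"status=error;status=ok;level=info;user_id=;timestamp="


def compress_with_dict(data: bytes, zdict: bytes = PRESET_DICT) -> bytes:
    """Compress to the zlib format using a preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()


def decompress_with_dict(blob: bytes, zdict: bytes = PRESET_DICT) -> bytes:
    """Decompress data that was compressed with the same dictionary."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


# Illustrative message made of strings that also appear in the dictionary.
message = b"timestamp=1700000000;user_id=42;level=info;status=ok;"

with_dict = compress_with_dict(message)
without_dict = zlib.compress(message, 9)

restored = decompress_with_dict(with_dict)
print(len(message), len(without_dict), len(with_dict))
```

The receiver must supply exactly the same dictionary bytes; a stream compressed this way cannot be decompressed without them. The gain is largest for short files, where an ordinary compressor has little history to draw on.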

There is nothing special about the construction of a dictionary. All it is is 32K bytes of strings that you believe will occur often in the data you are compressing. You should put the most common strings at the end of the 32K.
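One possible way to assemble such a dictionary from sample files is sketched below. It is only an assumption about how the strings might be chosen: it splits on whitespace and ranks words by frequency, placing the most frequent ones last and keeping at most the final 32K bytes. The function name and the sample data are hypothetical.

```python
from collections import Counter
from typing import Iterable

MAX_DICT_SIZE = 32 * 1024  # 32K bytes, as described above


def build_dictionary(samples: Iterable[bytes], max_size: int = MAX_DICT_SIZE) -> bytes:
    """Join words from the samples, least frequent first, most frequent last."""
    counts = Counter(word for sample in samples for word in sample.split())
    # Ascending sort by count puts the most common words at the end.
    ordered = [word for word, _ in sorted(counts.items(), key=lambda kv: kv[1])]
    joined = b" ".join(ordered)
    # If the result is too long, keep the tail, which holds the common words.
    return joined[-max_size:]


# Made-up sample data standing in for representative text files.
samples = [b"error disk full", b"error network down", b"error disk slow"]
dictionary = build_dictionary(samples)
print(dictionary)
```

Whole phrases or recurring line fragments would likely work better than single words for real data; the frequency ranking here is just the simplest starting point.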

Mark Adler answered Oct 22 '25 04:10
