I have some ideas I would like to experiment with relating to data compression, but am finding it difficult to decipher some parts of how the standards are applied "in real life". I would like to look at some sample compressed files to observe how the blocks are arranged and the Huffman tree(s) are structured.
Are there any tools in existence which can help visualize this for a given compressed file (zip/gzip/deflate etc)? I'm picturing something like a tree view or some form of graph visualizer.
    import gzip
    compressed = open('alice.txt.gz', 'rb')
    gzip_file = gzip.GzipFile(fileobj=compressed)

In the above code, we save the open file object as compressed before giving it over to the GzipFile. That way, as we read the decompressed data out of gzip_file, we’ll be able to use the tell() method to see how far we are through the compressed file.
So it makes sense that whatever I visualize should include the position in the file along the X axis, and the compressed size along the Y axis. An uncompressed file would simply be a diagonal line.
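To make the tell() idea concrete, here is a minimal sketch of how the points for such a plot might be collected. The function name compression_trace and the chunk_size parameter are made up for illustration, and an in-memory io.BytesIO stands in for the open file so the snippet is self-contained. One caveat: GzipFile reads the underlying file in buffered chunks, so the compressed position advances in jumps rather than byte by byte.

```python
import gzip
import io

def compression_trace(gz_bytes, chunk_size=1024):
    """Return (uncompressed_pos, compressed_pos) pairs for gzip data.

    Hypothetical helper: the name and chunk_size are illustrative only.
    """
    compressed = io.BytesIO(gz_bytes)           # stands in for open(..., 'rb')
    gzip_file = gzip.GzipFile(fileobj=compressed)
    trace = [(0, 0)]
    while True:
        chunk = gzip_file.read(chunk_size)      # decompressed bytes
        if not chunk:
            break
        # gzip_file.tell() is the offset in the decompressed stream,
        # compressed.tell() is how far the underlying file has been read.
        trace.append((gzip_file.tell(), compressed.tell()))
    return trace
```

Plotting the first element of each pair on the X axis and the second on the Y axis should give roughly the kind of curve described above.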
There may be readily available tools for visualizing this, but I didn’t find anything. Since I know gzip is implemented in the Python standard libraries, and I’m familiar with Python plotting libraries, I thought I would try to make my own visualization. This blog post (which is in fact just a Jupyter notebook) is the result.
The piecewise nature of the plot is simply due to how differently gzip performs on the two types of data. The first segment is text data, which compresses well, and so the slope is not very steep. The second segment is random data, which does not compress well, and so the slope is steep. This alternates for all the segments.
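The difference in slope can be checked directly by compressing one segment of each kind on its own. This is only a rough sketch: compressed_ratio is a made-up helper, zlib is used instead of gzip for brevity (both use DEFLATE), and the repeated sentence is a stand-in that compresses far better than real prose would.

```python
import os
import zlib

def compressed_ratio(data, level=6):
    """Compressed size divided by original size (hypothetical helper)."""
    return len(zlib.compress(data, level)) / len(data)

# Stand-in "text" segment: highly repetitive, so it compresses very well.
text_segment = b"the quick brown fox jumps over the lazy dog\n" * 500
# Random segment of the same length: essentially incompressible.
random_segment = os.urandom(len(text_segment))

print("text ratio:  ", compressed_ratio(text_segment))
print("random ratio:", compressed_ratio(random_segment))
```

A ratio near 1 corresponds to a steep segment of the plot, and a small ratio to a shallow one.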
You might be interested in this (if you are still interested that is :-P)
http://jvns.ca/blog/2013/10/24/day-16-gzip-plus-poetry-equals-awesome/
I made an "entropy image" tool.
The entropy_image tool replaces each pixel with the (estimated) number of bits necessary to encode that pixel using range coding or Huffman compression.
I hope this isn't the only compression visualization tool in the world.
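For readers who want a feel for the "bits per pixel" idea, here is a very simplified sketch. It is not how the entropy_image tool works internally (its model isn't described here). It assumes the crudest possible estimate, where each value costs -log2(p) bits and p is that value's frequency in the whole input. The function name bits_per_symbol is made up for illustration.

```python
import math
from collections import Counter

def bits_per_symbol(values):
    """Estimate the cost in bits of each value from its overall frequency.

    Assumption: a context-free (order-0) model, so every occurrence of the
    same value gets the same cost of -log2(frequency).
    """
    counts = Counter(values)
    total = len(values)
    return [-math.log2(counts[v] / total) for v in values]

# Rare values cost more bits than common ones.
pixels = [0, 0, 0, 1]
print(bits_per_symbol(pixels))
```

Mapping each estimated cost back to a brightness value would give a picture where hard-to-compress regions stand out.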