I am told that Huffman coding is used as loseless data compression algorithm, but I am also told that real data compress software do not employ Huffman coding, because if the keys are not distributed decentralized enough, the compressed file could be even larger than the orignal file.
This leaves me wondering are there any real-world application of Huffman coding?
Huffman is widely used in all the mainstream compression formats that you might encounter - from GZIP, PKZIP (winzip etc) and BZIP2, to image formats such as JPEG and PNG.
The Huffman encoding scheme takes advantage of the disparity between frequencies and uses less storage for the frequently occurring characters at the expense of having to use more storage for each of the more rare characters.
Given some set of documents for instance, encoding those documents as Huffman codes is the most space efficient way of encoding them, thus saving space. This however only applies to that set of documents as the codes you end up are dependent on the probability of the tokens/symbols in the original set of documents.
Huffman is widely used in all the mainstream compression formats that you might encounter - from GZIP, PKZIP (winzip etc) and BZIP2, to image formats such as JPEG and PNG.
All compression schemes have pathological data-sets that cannot be meaningfully compressed; the archive formats I listed above simply 'store' such files uncompressed when they are encountered.
Newer arithmetic and range coding schemes are often avoided because of patent issues, meaning Huffman remains the work-horse of the compression industry.
See Wikipedia article on the subject:
Huffman coding today is often used as a "back-end" to some other compression method. DEFLATE (PKZIP's algorithm) and multimedia codecs such as JPEG and MP3 have a front-end model and quantization followed by Huffman coding.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With