
What is the "Law of the Eight"?

While studying this document on the evolution of JPEG, I came across "The Law of the Eight" in Section 7.3 of that document.

Despite the introduction of other block sizes from 1 to 16 with the SmartScale extension, beyond the fixed size 8 in the original JPEG standard, the fact remains that the block size of 8 will still be the default value, and that all other-size DCTs are scaled in reference to the standard 8x8 DCT.

The “Law of the Eight” explains, why the size 8 is the right default and reference value for the DCT size.

My question is

What exactly is this "law of the eight"?

  • Historically, was a study performed that evaluated numerous sample images to arrive at the conclusion that an 8x8 image block contains enough redundant data to support DCT-based compression techniques? With very large image sizes like 8M (4Kx4K) fast becoming the norm in most digital images and videos, is this assumption still valid?

  • Another historic reason to limit the macro-block to 8x8 would have been the computationally prohibitive image-data size for larger macro-blocks. With modern massively parallel architectures (e.g. CUDA), that restriction no longer applies.

Similar earlier questions exist (1, 2 and 3), but none of them provides any details, links, or references to this mysterious fundamental "law of the eight".


1. References, excerpts, or details of the original study will be highly appreciated, as I would like to repeat it with a modern data set of very large images to test whether 8x8 macro-blocks are still optimal.

2. If a similar study has been carried out recently, references to it are welcome too.

3. I do understand that SmartScale is controversial. Without any clear potential benefits 1, at best it is comparable with other backward-compliant extensions of the JPEG standard 2. My goal is to understand whether the original reasons behind choosing 8x8 as the DCT block size (in the JPEG image compression standard) are still relevant, hence I need to know what the law of the eight is.

TheCodeArtist asked Aug 16 '13 06:08


1 Answer

My understanding is, the Law of the Eight is simply a humorous reference to the fact that the Baseline JPEG algorithm prescribed 8x8 as its only block size.

P.S. In other words, "the Law of the Eight" is a way to explain why "all other-size DCTs are scaled in reference to 8x8 DCT" by bringing in the historical perspective: the lack of support for any other size in the original standard and its de facto implementations.

The next question to ask: why Eight? (Note that despite being a valid question, this is not the subject of the present discussion, which would still be relevant even if another value had been picked historically, e.g. "Law of the Ten" or "Law of the Thirty Two".) The answer to that one is: because the computational cost of a directly evaluated N-point DCT grows as O(N^2) (unless FCT-class algorithms are employed, which grow more slowly, as O(N log N), but are harder to implement on the primitive hardware of embedded platforms and hence have limited applicability), so larger block sizes quickly become impractical. This is why 8x8 was chosen: small enough to be practical on a wide range of platforms, but large enough to allow not-too-coarse control of quantization levels for different frequencies.
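
To make the cost argument concrete, here is a minimal sketch in Python, assuming a naive (direct) DCT-II applied separably to an N x N block. It is an illustration only, not how real JPEG codecs implement the transform, and the function names (`dct_1d`, `dct_2d`, `mults_per_pixel`) are hypothetical. It counts the inner-loop multiplications to show how the per-pixel cost of the direct method grows with the block size.

```python
import math

def dct_1d(x):
    """Naive (direct) DCT-II with orthonormal scaling.

    Returns (coefficients, multiplication count). Only the N*N
    multiplications of the inner loop are counted, not the scaling.
    """
    n = len(x)
    out = []
    mults = 0
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = 0.0
        for i in range(n):
            acc += x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            mults += 1
        out.append(scale * acc)
    return out, mults

def dct_2d(block):
    """Separable 2D DCT: transform every row, then every column."""
    n = len(block)
    total = 0
    rows = []
    for row in block:
        coeffs, m = dct_1d(row)
        rows.append(coeffs)
        total += m
    cols = []
    for c in range(n):
        coeffs, m = dct_1d([rows[r][c] for r in range(n)])
        cols.append(coeffs)
        total += m
    # Transpose back so result[u][v] is (vertical, horizontal) frequency.
    result = [[cols[c][r] for c in range(n)] for r in range(n)]
    return result, total

def mults_per_pixel(n):
    """Multiplications per pixel for one n x n block (arbitrary test data)."""
    block = [[float((r * n + c) % 7) for c in range(n)] for r in range(n)]
    _, total = dct_2d(block)
    return total / (n * n)

if __name__ == "__main__":
    for n in (4, 8, 16, 32):
        print(n, mults_per_pixel(n))
```

Under these assumptions each block needs 2N^3 multiplications, i.e. 2N per pixel, so doubling the block size doubles the per-pixel work of the direct method. Fast algorithms reduce this, which is the trade-off described above.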

Since the standard has clearly scratched an itch, a whole ecosphere soon grew around it, including implementations optimized for 8x8 as their sole supported block size. Once the ecosphere was in place, it became impossible to change the block size without breaking existing implementations. As that was highly undesirable, any tweaks to DCT/quantization parameters had to remain compatible with 8x8-only decoders. I believe this consideration must be what's referred to as the "Law of the Eight".

While not being an expert, I don't see how larger block sizes can help. First, the dynamic range of values in one block will increase on average, requiring more bits to represent them. Second, the relative quantization of frequencies ranging from "all" (represented by the block) to "pixel" has to stay the same (it is dictated by human perception bias, after all), so the quantization will only get a bit smoother, and for the same compression level the potential quality increase will likely be unnoticeable.

18 revs answered Nov 04 '22 10:11