
DCT Compression - Block Size, Choosing Coefficients

I'm trying to understand the effect of the Block Size and best strategy of choosing the Coefficients in DCT compression. Basically I want to ask what I wrote here:

Video Compression: What is discrete cosine transform?

Let's assume the most primitive compression: splitting an image into blocks, performing a DCT on each block, and zeroing out some coefficients.
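As a rough sketch of that primitive scheme for a single square block, assuming an orthonormal DCT-II written in plain Python (the helper names `dct_matrix`, `dct2`, `idct2`, and `compress_block` are made up for illustration, and a real codec would also quantize and entropy-code the coefficients):

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis, one row per frequency index k."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def dct2(block):
    """2-D DCT of a square block: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def idct2(coeffs):
    """Inverse 2-D DCT: C^T * Y * C (C is orthonormal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

def compress_block(block, keep):
    """Zero every coefficient (u, v) where keep(u, v) is false, then invert."""
    n = len(block)
    coeffs = dct2(block)
    kept = [[coeffs[u][v] if keep(u, v) else 0.0 for v in range(n)]
            for u in range(n)]
    return idct2(kept)

# Example: an 8x8 gradient block, keeping only coefficients with u + v < 3.
block = [[float(i + j) for j in range(8)] for i in range(8)]
approx = compress_block(block, lambda u, v: u + v < 3)
```

The `keep` predicate is where the selection strategy (square, triangle, or anything else based on location) plugs in.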

To my understanding, the smaller the block the better. Smaller blocks mean the pixels are more correlated, so the energy in the DCT spectrum is more "compact". This effect should be more pronounced in fast-varying (high-frequency) images.

Let's say we zero out a certain percentage of the coefficients: which would give the best image quality, small or large blocks? If we keep 10%, 25%, 50%, or 75%, would the answer differ with the percentage?

Another issue is how to choose the coefficients you leave untouched. Let's say I have to make the decision based on location and not energy. Would you take a square from the top left corner? I've averaged many blocks in the DCT spectrum and concluded the best would be taking a triangle from the top left corner. What do you think?
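To make the two location-based choices concrete, here is a small sketch that builds both masks for an n x n block. The function names and the `side` / `size` parameters are illustrative assumptions, with coefficients indexed as (u, v) from the top left corner:

```python
def square_mask(n, side):
    """Keep the side x side square in the top left corner."""
    return [[u < side and v < side for v in range(n)] for u in range(n)]

def triangle_mask(n, size):
    """Keep coefficients on the first `size` anti-diagonals (u + v < size)."""
    return [[u + v < size for v in range(n)] for u in range(n)]

def count(mask):
    """Number of coefficients a mask keeps."""
    return sum(sum(row) for row in mask)

print(count(square_mask(8, 4)))    # 16 of 64 coefficients
print(count(triangle_mask(8, 5)))  # 15 of 64 coefficients
```

For roughly the same budget, the triangle gives up diagonal terms such as (3,3) in exchange for axis-aligned terms such as (0,4) and (4,0), so which one looks better would depend on where the image's energy actually falls.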

Hopefully we'll have an effective discussion.

Royi asked Jan 22 '23 20:01



1 Answer

The essence of your question seems to be about image quality. There has been a considerable literature produced on the subject, and the result is that image quality is a hard thing to determine.

Standard mathematical error measures like the signal-to-noise ratio (SNR) and mean-squared error (MSE) can give a quantitative answer, but it is well known that these don't correlate well with subjective viewer opinions, which must be our final authority. No other methods, even those founded on psycho-visual models of the viewer (e.g., S.A. Karunasekera and N.G. Kingsbury, "A distortion measure for blocking artifacts in images based on human visual sensitivity", IEEE Trans. on Image Proc. vol. 4, no. 6, June 1995, pp. 713-724; and M. Miyahara, K. Kotani, and V. R. Algazi, "Objective picture quality scale (PQS) for image coding", IEEE Trans. on Comm. vol. 46, no. 9, Sept. 1998, pp. 1215-1226), have proven themselves to be better than SNR.

Moreover, when you vary the type of imagery (line drawing, cartoon, photo, portrait, etc.), certain types of compression distortion become more evident. Mosquito noise might be objectionable in one image, while staircase noise might be the culprit in another.

In short, there is no pat answer to your question, "what would result in best image quality?"

That being said, we can say some things about the DCT that are of relevance. The coefficients in the DCT of a block go from low variation to high variation in a zig-zag pattern from the top left corner [(0,0)->(0,1)->(1,0)->(2,0)->(1,1)->(0,2)->etc.], which your triangle selection mirrors. The closer a coefficient is to the top left corner, the smoother the information it carries [in fact, the (0,0) DCT value is proportional to the average of the whole block], and the farther away from that corner you get, the more "high frequency" detail it represents. The closer a coefficient is to the top row or left column of the block, the more it represents horizontal and vertical details, and the closer to the diagonal of the block, the more diagonal details.
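The zig-zag order listed above can be generated by walking the anti-diagonals of the block and alternating direction. This is a minimal sketch, assuming indices are (row, column) pairs and using a made-up function name `zigzag`:

```python
def zigzag(n):
    """Return the (row, col) pairs of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):           # s = row + col indexes an anti-diagonal
        lo = max(0, s - n + 1)           # smallest valid row on this diagonal
        hi = min(s, n - 1)               # largest valid row on this diagonal
        rows = range(lo, hi + 1)
        if s % 2 == 0:                   # even diagonals run bottom-left to top-right
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order

print(zigzag(8)[:6])
# [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```

Keeping the first k entries of this list is one way to realize a "triangle from the top left corner" for an arbitrary coefficient budget k.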

In brief, lossy compression usually entails throwing away some of the "details" that may not be perceptible to the eye. (Throwing away the "smoother" DCT values results in severe distortion.) The more DCT values you throw away, the greater your compression ratio will be, but also the greater distortion you'll induce.
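For the MSE measure mentioned earlier, this trade-off can be written down directly, assuming the DCT is scaled to be orthonormal. The symbols here are introduced only for illustration: $x$ is an $N \times N$ block, $X_{u,v}$ are its DCT coefficients, $K$ is the set of coefficient positions that are kept, and $\hat{x}$ is the block reconstructed from the kept coefficients alone. Because an orthonormal transform preserves energy, the error in the pixel domain equals the energy of the discarded coefficients:

$$
\mathrm{MSE} = \frac{1}{N^2} \sum_{i,j} \left( x_{i,j} - \hat{x}_{i,j} \right)^2 = \frac{1}{N^2} \sum_{(u,v) \notin K} X_{u,v}^2
$$

So every extra coefficient dropped can only add to the MSE, and dropping a large low-frequency coefficient costs far more than dropping a small high-frequency one. This says nothing about perceived quality, which is the harder question.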

As for block size, it all depends. The more variance and detail there is in a block, the more you'll lose by throwing away coefficients. Some compression algorithms adaptively use different block sizes within the same image so that high-detail regions receive more and smaller blocks and smooth regions receive fewer and larger blocks.
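One simple way to picture adaptive block sizes is a quadtree split driven by local variance. The sketch below is only an illustration of that idea, not the method of any particular codec; the function names, the `min_size` limit, and the variance `threshold` are all assumptions:

```python
def variance(img, top, left, size):
    """Pixel variance of the size x size region starting at (top, left)."""
    vals = [img[r][c] for r in range(top, top + size)
            for c in range(left, left + size)]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

def split_blocks(img, top, left, size, min_size, threshold):
    """Return (top, left, size) blocks; busy regions are split into quarters."""
    if size <= min_size or variance(img, top, left, size) <= threshold:
        return [(top, left, size)]
    half = size // 2
    blocks = []
    for dr in (0, half):
        for dc in (0, half):
            blocks += split_blocks(img, top + dr, left + dc,
                                   half, min_size, threshold)
    return blocks

# 16x16 test image: flat gray except a checkerboard in the top left 8x8 corner.
img = [[(255 if (r + c) % 2 else 0) if r < 8 and c < 8 else 128
        for c in range(16)] for r in range(16)]
blocks = split_blocks(img, 0, 0, 16, min_size=4, threshold=1.0)
print(len(blocks))  # 7: four 4x4 blocks in the busy corner, three 8x8 elsewhere
```

The detailed corner ends up covered by small blocks while the flat regions keep large ones, which is the behavior described above.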

For algorithms that use a single block size, 8x8 is the most common choice (JPEG and MPEG-1/2 apply the DCT to 8x8 blocks), though larger sizes such as 16x16 and 32x32 are also used. Compressing with a fixed block size takes less processing than an adaptive scheme, but the quality will generally be lower too.

metal answered Feb 16 '23 17:02
