I'm trying to understand downscaling. I can see how interpolation algorithms such as bicubic and nearest neighbour can be used when upscaling, to "fill in the blanks" between the old, known points (pixels, in the case of images).
But downscaling? I can't see how any interpolation technique can be used there. There are no blanks to fill!
I've been stuck on this for far too long; give me a nudge in the right direction. How do you interpolate when you are, in fact, removing known data?
Edit: Let's assume we have a one-dimensional image with one colour channel per point. A downscaling algorithm that scales 6 points to 3 by averaging pixel values looks like this: 1,2,3,4,5,6 → (1+2)/2, (3+4)/2, (5+6)/2. Am I on the right track here? Is this interpolation in downscaling, rather than just discarding data?
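The averaging scheme in the edit can be sketched in a few lines; this is a minimal box-filter downscale (the function name is my own, not from any particular library):

```python
# A minimal sketch of the question's 6 -> 3 averaging example.
# Each output pixel is the mean of a non-overlapping block of input pixels.
def downscale_by_averaging(pixels, factor):
    """Average blocks of `factor` adjacent pixels (a box filter)."""
    return [sum(pixels[i:i + factor]) / factor
            for i in range(0, len(pixels), factor)]

print(downscale_by_averaging([1, 2, 3, 4, 5, 6], 2))  # [1.5, 3.5, 5.5]
```

This is indeed a form of interpolation/filtering rather than discarding: every input pixel contributes to the output.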
Lanczos is one of several practical variants of sinc resampling that try to improve on simply truncating the sinc kernel, and it is probably the best default choice for downscaling still images.
Some of the common interpolation algorithms are nearest neighbor, bilinear, and bicubic [2]. Other interpolation algorithms, such as Catmull-Rom and Mitchell-Netravali, produce better image quality.
BICUBIC INTERPOLATION Bicubic produces noticeably sharper images than the previous two methods, and is perhaps the ideal combination of processing time and output quality. For this reason it is a standard in many image editing programs (including Adobe Photoshop), printer drivers and in-camera interpolation.
If one conceptualizes an original pixel as having a width n, then the center of the pixel is n/2 from either edge.
One may assume that this point, at the center of the pixel, defines the color.
If you are downsampling, you can think about it this way conceptually: even though you are reducing the physical size, imagine instead that you are keeping the same dimensions but reducing the number of pixels (which, conceptually, are increasing in size). Then one can do the math...
Example: say your image is 1 pixel high and 3 pixels wide, and you are only going to downscale horizontally. Let's say you are going to change this to 2 pixels wide. Now the original image is 3n wide, and you are turning it into 2 pixels, so each new pixel will take up (3/2)n of the original image.
Now think about the centers again... the new pixels' centers are at (3/4)n and at (9/4)n [which is (3/4)n + (3/2)n]. The original pixels' centers were at (1/2)n, (3/2)n, and (5/2)n. Thus each new center falls somewhere between the original pixels' centers; none match up with them. Look at the first new pixel at (3/4)n: it is (1/4)n away from the first original pixel and (3/4)n away from the second.
If we want to maintain a smooth image, use the inverse relationship: take (3/4) of the first pixel's color value + (1/4) of the second's, since the new pixel center is, conceptually, closer to the first original pixel center ((1/4)n away) than to the second ((3/4)n away).
Thus one does not have to truly discard data; one just computes each new value as a weighted average of its neighbors, using the appropriate ratios (in a conceptual space where the physical size of the total image does not change). It is averaging rather than strict skipping/discarding.
In a 2D image the ratios are more complicated to calculate, but the gist is the same: interpolate, pulling more of the value from the closest original neighbors. The resulting image should look quite similar to the original, provided the downsample is not terribly severe.
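The center-alignment scheme above can be sketched directly; this is my own minimal implementation of the idea (linear interpolation between the two nearest original centers), not code from any library:

```python
def downscale_linear(pixels, new_size):
    """Resample a 1D row by linearly interpolating between the two
    nearest original pixel centers (the center-alignment scheme above)."""
    old_size = len(pixels)
    scale = old_size / new_size
    out = []
    for j in range(new_size):
        center = (j + 0.5) * scale       # new center, in old-pixel units
        x = center - 0.5                 # position relative to old centers
        i = max(0, min(int(x), old_size - 2))  # left neighbor, clamped
        t = x - i                        # fraction toward right neighbor
        out.append((1 - t) * pixels[i] + t * pixels[i + 1])
    return out

# The 3 -> 2 example: new centers at (3/4)n and (9/4)n.
print(downscale_linear([10, 20, 30], 2))  # [12.5, 27.5]
```

The first output is (3/4)·10 + (1/4)·20 = 12.5, exactly the weighting derived above; the second new center at (9/4)n sits between the second and third original centers and gets (1/4)·20 + (3/4)·30 = 27.5.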