Introduction
I'm interested in writing a function that outputs the next level in a Gaussian pyramid (I eventually want to get to creating a Laplacian pyramid) for use in image processing. (Link for reference: https://en.wikipedia.org/wiki/Pyramid_(image_processing)#Gaussian_pyramid)
The Downsampling Problem
Now the easy part of this is that when you downsample or upsample, a 5-tap filter is convolved with the image before resizing.
However, the interesting part about making image pyramids is that you have to downsample or upsample an image by a factor of .5 or 2, depending on which direction you're going. Swift has a few ways of doing this, such as CIAffineTransform and CILanczosScaleTransform, but I'm wondering if there are ways to do it a bit more naively, because I don't care about the quality of the resized image. For this post, I'm going to use Lenna (512x512) as an example, seen below:
If we want to downsample an image by a factor of two, we take all of the odd-numbered pixels to form a new image. In MATLAB this is performed as follows (after the Gaussian blur):
If I is your input image, NxM in size with 3 color channels (a 512x512x3 matrix), then the image decimated by a scale of .5 is
R = I(1:2:end, 1:2:end, :)
That is, the new image consists of the odd-numbered rows and columns of the previous one. This yields the following 256x256 photo, the first level of the Gaussian pyramid:
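For reference, the same decimation can be sketched in plain Swift on a raw luminance plane (the `[[Float]]` representation here is a hypothetical stand-in; a real RGB image would apply the same indexing to each channel):

```swift
// Keep every other row and column, mirroring MATLAB's I(1:2:end, 1:2:end).
// MATLAB's 1-based odd indices correspond to 0, 2, 4, ... in 0-based Swift.
func decimate(_ image: [[Float]]) -> [[Float]] {
    return stride(from: 0, to: image.count, by: 2).map { row in
        stride(from: 0, to: image[row].count, by: 2).map { col in
            image[row][col]
        }
    }
}

let input: [[Float]] = [
    [1,  2,  3,  4],
    [5,  6,  7,  8],
    [9, 10, 11, 12],
    [13, 14, 15, 16]
]
let half = decimate(input)  // [[1, 3], [9, 11]]
```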
Does such a thing exist in Swift? Is it doable in Core Image, or maybe an OpenGL custom filter?
The Upsampling Problem
Upsampling is really only used when creating a Laplacian pyramid. However, the naive way to do this is the following:
Initialize R, a blank image context of the size you want to upsample to. In this case we will be upsampling the downsampled Lenna photo seen above, so R must be a 512x512 blank image.
Next, multiply the pixel values of the downsampled image I by 4. This can be done in Swift by convolving the image with the 3x3 matrix [0,0,0; 0,4,0; 0,0,0]. Then uniformly distribute the pixels of the downsampled image into the larger blank image R. This looks like:
Finally, one can apply the same 5-tap Gaussian blur to this image to recover the upsampled image:
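The zero-insertion step described above can also be sketched in plain Swift (function and variable names here are hypothetical). The x4 gain compensates for the fact that only one pixel in four is nonzero after insertion, so the subsequent Gaussian blur restores the original brightness while interpolating the gaps:

```swift
// Spread each pixel of `image` into a 2x-larger grid, multiplying by 4
// and leaving the remaining positions zero.
func zeroInsertUpsample(_ image: [[Float]]) -> [[Float]] {
    let h = image.count
    let w = image.first?.count ?? 0
    var out = [[Float]](repeating: [Float](repeating: 0, count: w * 2), count: h * 2)
    for row in 0..<h {
        for col in 0..<w {
            out[row * 2][col * 2] = image[row][col] * 4
        }
    }
    return out
}

let small: [[Float]] = [[1, 2], [3, 4]]
let big = zeroInsertUpsample(small)
// big is [[4, 0, 8, 0], [0, 0, 0, 0], [12, 0, 16, 0], [0, 0, 0, 0]]
```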
I'd like to know if it's possible to employ a similar method of upsampling in Swift.
Another thing I'm unsure of is whether the resizing technique really matters for Gaussian/Laplacian filtering. If not, I could certainly just use the fastest built-in method rather than trying to make my own.
The GPUImage processing library can give you some up-sampling and possibly lead to your Laplacian Pyramid.
pod 'GPUImage'
SHARPEN UPSAMPLING:
UIImage *inputImage = [UIImage imageNamed:@"cutelady"];
GPUImagePicture *stillImageSource = [[GPUImagePicture alloc] initWithImage:inputImage];
GPUImageSharpenFilter *stillImageFilter = [[GPUImageSharpenFilter alloc] init];
[stillImageSource addTarget:stillImageFilter];
[stillImageFilter useNextFrameForImageCapture];
[stillImageSource processImage];
UIImage *currentFilteredVideoFrame = [stillImageFilter imageFromCurrentFramebuffer];
LANCZOS UPSAMPLING:
UIImage *inputImage = [UIImage imageNamed:@"cutelady"];
GPUImagePicture *stillImageSource = [[GPUImagePicture alloc] initWithImage:inputImage];
GPUImageLanczosResamplingFilter *stillImageFilter = [[GPUImageLanczosResamplingFilter alloc] init];
[stillImageSource addTarget:stillImageFilter];
[stillImageFilter useNextFrameForImageCapture];
// Set the target size before processing so the resampling takes effect.
[stillImageSource forceProcessingAtSizeRespectingAspectRatio:CGSizeMake(200, 200)];
[stillImageSource processImage];
UIImage *currentFilteredVideoFrame = [stillImageFilter imageFromCurrentFramebuffer];
cell.imageView.image = currentFilteredVideoFrame;
I have made some progress, and I pretty much consider this an answer to my question, although some things are a tad different and I don't think this method is very fast. I would love to hear from anyone on how to make this code faster. Resizing the image seems to take up the most time: I get a TON of calls to the overridden outputImage section and I have no idea why that is. Unfortunately, when I run the Laplacian pyramid function below, it takes around 5 seconds to complete on a 275x300 photo. This is just no good, and I'm at a bit of a loss as to how to speed it up. My suspicion is that the resample filter is the culprit, but I am not well versed enough to know how to make it faster.
First, the custom filters:
This first one resizes an image by simple rescaling. I think it's the best rescaling technique in this case because all it does is replicate pixels when resizing. For example, if we have the following block of pixels and perform a 2.0x scale, then the mapping looks like the following:
[ ][ ][x][ ] ----->[ ][ ][ ][ ][x][x][ ][ ]
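The warp kernel in the filter below implements exactly this mapping: each destination coordinate is divided by the scale, and the source pixel at that (floored) position is replicated. The index math can be checked in plain Swift (the helper name here is hypothetical):

```swift
import Foundation

// Nearest-neighbor source index for a destination coordinate at a given
// scale, mirroring the warp kernel's destCoord()/scale mapping.
func sourceIndex(forDestination dest: Int, scale: CGFloat) -> Int {
    return Int(floor(CGFloat(dest) / scale))
}

// At a 2.0x scale, destination pixels 4 and 5 both sample source pixel 2,
// matching the [ ][ ][x][ ] -> [ ][ ][ ][ ][x][x][ ][ ] diagram above.
let mapped = (0..<8).map { sourceIndex(forDestination: $0, scale: 2.0) }
// mapped == [0, 0, 1, 1, 2, 2, 3, 3]
```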
(Thanks to Simon Gladman for the idea on this one)
public class ResampleFilter: CIFilter
{
var inputImage : CIImage?
var inputScaleX: CGFloat = 1
var inputScaleY: CGFloat = 1
let warpKernel = CIWarpKernel(string:
"kernel vec2 resample(float inputScaleX, float inputScaleY)" +
" { " +
" float y = (destCoord().y / inputScaleY); " +
" float x = (destCoord().x / inputScaleX); " +
" return vec2(x,y); " +
" } "
)
override public var outputImage: CIImage!
{
if let inputImage = inputImage,
kernel = warpKernel
{
let arguments = [inputScaleX, inputScaleY]
let extent = CGRect(origin: inputImage.extent.origin,
size: CGSize(width: inputImage.extent.width*inputScaleX,
height: inputImage.extent.height*inputScaleY))
return kernel.applyWithExtent(extent,
roiCallback:
{
(index,rect) in
let sampleX = rect.origin.x/self.inputScaleX
let sampleY = rect.origin.y/self.inputScaleY
let sampleWidth = rect.width/self.inputScaleX
let sampleHeight = rect.height/self.inputScaleY
let sampleRect = CGRect(x: sampleX, y: sampleY, width: sampleWidth, height: sampleHeight)
return sampleRect
},
inputImage : inputImage,
arguments : arguments)
}
return nil
}
}
This one is a simple difference blend.
public class DifferenceOfImages: CIFilter
{
var inputImage1 : CIImage? //Initializes input
var inputImage2 : CIImage?
var kernel = CIKernel(string: //The actual custom kernel code
"kernel vec4 Difference(__sample image1,__sample image2)" +
" { " +
" float colorR = image1.r - image2.r; " +
" float colorG = image1.g - image2.g; " +
" float colorB = image1.b - image2.b; " +
" return vec4(colorR,colorG,colorB,1); " +
" } "
)
var extentFunction: (CGRect, CGRect) -> CGRect =
{ (a: CGRect, b: CGRect) in return CGRectZero }
override public var outputImage: CIImage!
{
guard let inputImage1 = inputImage1,
inputImage2 = inputImage2,
kernel = kernel
else
{
return nil
}
//apply to whole image
let extent = extentFunction(inputImage1.extent,inputImage2.extent)
//arguments of the kernel
let arguments = [inputImage1,inputImage2]
//return the rectangle that defines the part of the image that CI needs to render rect in the output
return kernel.applyWithExtent(extent,
roiCallback:
{ (index, rect) in
return rect
},
arguments: arguments)
}
}
Now for some function definitions:
This function just performs a Gaussian blur on the image, using the same 5-tap filter described in Burt & Adelson's paper. I'm not sure how to get rid of the awkward border pixels that seem to be extra.
public func GaussianFilter(ciImage: CIImage) -> CIImage
{
//5x5 convolution to image
let kernelValues: [CGFloat] = [
0.0025, 0.0125, 0.0200, 0.0125, 0.0025,
0.0125, 0.0625, 0.1000, 0.0625, 0.0125,
0.0200, 0.1000, 0.1600, 0.1000, 0.0200,
0.0125, 0.0625, 0.1000, 0.0625, 0.0125,
0.0025, 0.0125, 0.0200, 0.0125, 0.0025 ]
let weightMatrix = CIVector(values: kernelValues,
count: kernelValues.count)
let filter = CIFilter(name: "CIConvolution5X5",
withInputParameters: [
kCIInputImageKey: ciImage,
kCIInputWeightsKey: weightMatrix])!
let final = filter.outputImage!
let rect = CGRect(x: 0, y: 0, width: ciImage.extent.size.width, height: ciImage.extent.size.height)
return final.imageByCroppingToRect(rect)
}
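As a sanity check on the weights above (my own note, not from Burt & Adelson directly): the 5x5 kernelValues matrix is the outer product of the paper's 1-D 5-tap kernel [.05, .25, .4, .25, .05] (a = 0.4) with itself, which is what makes the blur separable:

```swift
// Burt & Adelson's 1-D 5-tap kernel with a = 0.4.
let tap: [Double] = [0.05, 0.25, 0.4, 0.25, 0.05]

// The 5x5 weight matrix used in GaussianFilter is the outer product of
// this kernel with itself; e.g. the center weight is 0.4 * 0.4 = 0.16.
let weights: [Double] = tap.flatMap { row in tap.map { col in row * col } }
```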
This function just simplifies the use of the resample filter. You can specify a target size for the new image; this turns out to be easier to deal with than setting a scale parameter, IMO.
public func resampleImage(inputImage: CIImage, sizeX: CGFloat, sizeY: CGFloat) -> CIImage
{
let inputWidth : CGFloat = inputImage.extent.size.width
let inputHeight : CGFloat = inputImage.extent.size.height
let scaleX = sizeX/inputWidth
let scaleY = sizeY/inputHeight
let resamplefilter = ResampleFilter()
resamplefilter.inputImage = inputImage
resamplefilter.inputScaleX = scaleX
resamplefilter.inputScaleY = scaleY
return resamplefilter.outputImage
}
This function just simplifies the use of the difference filter. Just note that it computes imageOne - imageTwo.
public func Difference(imageOne:CIImage,imageTwo:CIImage) -> CIImage
{
let generalFilter = DifferenceOfImages()
generalFilter.inputImage1 = imageOne
generalFilter.inputImage2 = imageTwo
generalFilter.extentFunction = { (fore, back) in return back.union(fore)}
return generalFilter.outputImage
}
This function computes the dimensions of each pyramid level and stores them in an array. Useful later on.
public func LevelDimensions(image: CIImage,levels:Int) -> [[CGFloat]]
{
let inputWidth : CGFloat = image.extent.width
let inputHeight : CGFloat = image.extent.height
var levelSizes : [[CGFloat]] = [[inputWidth,inputHeight]]
for j in 1...(levels-1)
{
let temp = [floor(inputWidth/pow(2.0,CGFloat(j))),floor(inputHeight/pow(2,CGFloat(j)))]
levelSizes.append(temp)
}
return levelSizes
}
Now on to the good stuff: this one creates a Gaussian pyramid with a given number of levels.
public func GaussianPyramid(image: CIImage,levels:Int) -> [CIImage]
{
let PyrLevel = LevelDimensions(image, levels: levels)
var GauPyr : [CIImage] = [image]
var I : CIImage
var J : CIImage
for j in 1 ... levels-1
{
J = GaussianFilter(GauPyr[j-1])
I = resampleImage(J, sizeX: PyrLevel[j][0], sizeY: PyrLevel[j][1])
GauPyr.append(I)
}
return GauPyr
}
Finally, this function creates the Laplacian pyramid with a given number of levels. Note that in both pyramid functions, each level is stored in an array.
public func LaplacianPyramid(image:CIImage,levels:Int) -> [CIImage]
{
let PyrLevel = LevelDimensions(image, levels:levels)
var LapPyr : [CIImage] = []
var I : CIImage
var J : CIImage
J = image
for j in 0 ... levels-2
{
let blur = GaussianFilter(J)
I = resampleImage(blur, sizeX: PyrLevel[j+1][0], sizeY: PyrLevel[j+1][1])
let diff = Difference(J,imageTwo: resampleImage(I, sizeX: PyrLevel[j][0], sizeY: PyrLevel[j][1]))
LapPyr.append(diff)
J = I
}
LapPyr.append(J)
return LapPyr
}