I am trying to do edge detection with convolution. I think I need to normalize the image after the convolution.
I am using the convolution matrix specified here: https://en.wikipedia.org/wiki/Kernel_(image_processing)#Convolution
Attached is some R code, along with the source and output images...
require(jpeg)

myjpg <- readJPEG("mtg.jpg")
grayImg <- myjpg[,,1] + myjpg[,,2] + myjpg[,,3] # reduce to gray by summing the channels
grayImg <- grayImg / max(grayImg)               # normalize to [0,1]
dim(grayImg)

convolve <- function(img, f){
  newimg <- img
  radius <- as.integer(nrow(f)/2) + 1
  print(radius)
  for(i in c(1:nrow(img))){
    for(j in c(1:ncol(img))){
      # clip the kernel and the image window so they stay inside the image at the borders
      f_sub   <- f[c(max(1,radius-i+1):min(nrow(f),nrow(img)-i+radius)), c(max(1,radius-j+1):min(ncol(f),ncol(img)-j+radius))]
      img_sub <- img[c(max(1,i-radius+1):min(nrow(img),i+radius-1)), c(max(1,j-radius+1):min(ncol(img),j+radius-1))]
      wavg <- sum(as.vector(f_sub)*as.vector(img_sub))# / sum(as.vector(f_sub)) # todo, not sure about this division
      newimg[i,j] <- wavg
    }
  }
  return(newimg)
}

edgeFilter <- matrix(c(-1,-1,-1,-1,8,-1,-1,-1,-1), ncol = 3)
outimg <- convolve(grayImg, edgeFilter)
outimg <- outimg - min(outimg)  # shift so the minimum is 0
outimg <- outimg / max(outimg)  # rescale so the maximum is 1
plot(c(0,1), c(0,1), t='n')
rasterImage(outimg, 0, 0, 1, 1)
Gray image before processing:
Gray image after processing:
I am confused because in the examples I have seen, the convolved image is black and white. Here, my convolution needs normalization, and the result is not pure black and white.
I think the example you have seen is not normalizing the image, so the negative values are clipped and only the positive values remain visible; and because the gray values are not rescaled, those positive responses show up as the brightest values in the image. You, on the other hand, are rescaling the values, so no clipping or clamping of the gray values happens. It all depends on the visualization.
Your visualization is perfectly normal. What's happening is that you are remapping pixel values so that the lowest intensity gets set to 0 and the highest intensity gets set to 1. All of the other values get linearly remapped to fit within the [0,1] range. The reason why you may see just black and white elsewhere is clipping: the user who posted those results may have truncated the dynamic range, where any values less than 0 get set to 0 and any values greater than 1 (or whatever the maximum value of the data type is) get set to 1.
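As a minimal sketch of what that clipping would look like in R (the numbers here are made up purely for illustration):

x <- c(-2.5, -0.3, 0, 0.4, 3.1)   # hypothetical raw edge responses
pmax(pmin(x, 1), 0)               # -> 0.0 0.0 0.0 0.4 1.0: negatives become black, large values saturate to white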
You are computing an edge detection where the kernel / mask has negative coefficients so it is entirely possible that you will get both negative and positive values in your result. Rescaling your image that way, you'll see that the values that are 0 get mapped to gray (around 0.5 or so) because the smallest intensities that are negative get pulled up to 0, which naturally means that your 0 values get pulled up to some non-zero number. Similarly, those values that are very large get normalized to 1.
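Here is a small worked example of that min-max remapping, again with made-up numbers:

x <- c(-3, 0, 1, 5)                    # hypothetical edge responses, both negative and positive
(x - min(x)) / (max(x) - min(x))       # -> 0.000 0.375 0.500 1.000: the original zeros land near mid-gray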
However, when the kernel represents a weighted average (such as a smoothing filter), it is standard practice to normalize the kernel itself. The reason is that by doing this, you ensure the output value at each pixel never goes beyond the dynamic range of the data type. Normalizing the kernel means that all of the coefficients lie within [0,1] and their total sum is 1. By doing this, you never have to check the output and clip values, and you don't need to divide by the sum of the weights in your convolution code at every pixel, because that normalization has already been taken care of by the kernel normalization step. You only have to normalize once. However, it's a tricky business when you have negative coefficients in the kernel; if there are negative coefficients, normalization is seldom done... at least from what I have seen in practice.
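A quick sketch of what kernel normalization means, using a 3 x 3 box kernel for illustration, and showing why the same trick does not apply to the edge kernel from the question:

boxKernel <- matrix(1, nrow = 3, ncol = 3)
boxKernel <- boxKernel / sum(boxKernel)      # every weight becomes 1/9, the sum is 1, so the output stays in [0,1]
edgeKernel <- matrix(c(-1,-1,-1,-1,8,-1,-1,-1,-1), ncol = 3)
sum(edgeKernel)                              # 0 -- dividing by the sum is not possible for this kernel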
Now, going back to the "black and white" stuff: if you used another filter... say... an average filter, you will certainly get just a "black and white" picture, because no value is ever negative, even after you normalize the output to the [0,1] range via the min-max approach. Bear in mind that this performs a contrast stretch: if your intensities were concentrated in a small subset of the [0,1] range, the output will be stretched so that the lowest intensity goes down to 0 and the largest intensity gets mapped to 1.
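For instance, with made-up intensities concentrated in a narrow band:

x <- c(0.40, 0.45, 0.50, 0.60)        # hypothetical low-contrast values
(x - min(x)) / (max(x) - min(x))      # -> 0.00 0.25 0.50 1.00: stretched to fill the whole [0,1] range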
I've modified your code to do this. Bear in mind that I couldn't find the original image you had without the axis lines, so I took a snapshot and saved it as a PNG. Therefore, I used the png package instead of the jpeg package:
require(png) # Change

myjpg <- readPNG("mtg.png") # Change
grayImg <- myjpg[,,1] + myjpg[,,2] + myjpg[,,3] # reduce to gray
grayImg <- grayImg / max(grayImg) # normalize
dim(grayImg)

convolve <- function(img, f){
  newimg <- img
  radius <- as.integer(nrow(f)/2) + 1
  print(radius)
  for(i in c(1:nrow(img))){
    for(j in c(1:ncol(img))){
      f_sub   <- f[c(max(1,radius-i+1):min(nrow(f),nrow(img)-i+radius)), c(max(1,radius-j+1):min(ncol(f),ncol(img)-j+radius))]
      img_sub <- img[c(max(1,i-radius+1):min(nrow(img),i+radius-1)), c(max(1,j-radius+1):min(ncol(img),j+radius-1))]
      wavg <- sum(as.vector(f_sub)*as.vector(img_sub))# / sum(as.vector(f_sub)) # todo, not sure about this division
      newimg[i,j] <- wavg
    }
  }
  return(newimg)
}

#edgeFilter <- matrix(c(-1,-1,-1,-1,8,-1,-1,-1,-1), ncol = 3)
averageFilter <- matrix(c(1,1,1,1,1,1,1,1,1), ncol = 3) / 9 # Change
#outimg <- convolve(grayImg,edgeFilter) # Change
outimg <- convolve(grayImg, averageFilter)
outimg <- outimg - min(outimg)
outimg <- outimg / max(outimg)
plot(c(0,1), c(0,1), t='n')
rasterImage(outimg, 0, 0, 1, 1)
Here's what I get:
Compare this with the original image:
If you stare at it closely, you'll see that the result is slightly blurred compared to the original. You'll see more of a blur if you increase the average filter size... say... to 7 x 7:
averageFilter <- matrix(rep(1,49), ncol=7) / 49
Doing this, this is the image we get:
As we expect... more blurred. However, the point here is that the dynamic range of your data determines how the image is visualized when you normalize it the min-max way. If there are negative values, expect that the values around 0 will be pushed to some non-zero value... usually gray. This happens if you specify a kernel with negative coefficients. If you have a kernel with strictly positive coefficients, you won't see any negative values, and the visualization is as you expected.
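If you do want to reproduce the black-and-white look with the edge kernel, one option (just a sketch, not part of the original post) is to clip the raw convolution output instead of min-max rescaling it, reusing the convolve function and grayImg defined above:

edgeFilter <- matrix(c(-1,-1,-1,-1,8,-1,-1,-1,-1), ncol = 3)   # the kernel from the question
outimg <- convolve(grayImg, edgeFilter)                        # raw responses, can be negative
outimg <- pmax(pmin(outimg, 1), 0)                             # clip: negatives -> 0 (black), above 1 -> 1 (white)
plot(c(0,1), c(0,1), t = 'n')
rasterImage(outimg, 0, 0, 1, 1)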