OpenCV's Sobel filter - why does it look so bad, especially compared to Gimp?

Tags: opencv, gimp, sobel

I'm trying to rebuild some preprocessing I have done before in Gimp, using OpenCV. The first stage is a Sobel filter for edge detection. It works very well in Gimp:

[Image: Sobel filter result in Gimp]

Now here is my attempt with OpenCV:

opencv_imgproc.Sobel(/* src = */ scaled, /* dst = */ sobel,
  /* ddepth = */ opencv_core.CV_32F,
  /* dx = */ 1, /* dy = */ 1, /* ksize = */ 5, /* scale = */ 0.25,
  /* delta = */ 0.0, /* borderType = */ opencv_core.BORDER_REPLICATE)

It looks very bad, basically highlighting points instead of contours:

[Image: Sobel filter result in OpenCV]

So what am I doing wrong, or how does Gimp achieve such a good result and how can I replicate it in OpenCV?

asked Nov 05 '16 at 23:11 by 0__


1 Answer

Info

Image used from https://www.pexels.com/photo/brown-wooden-flooring-hallway-176162/ ("Free for personal and commercial use").

Solution TL;DR

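Edge detection via the Sobel filter requires two separate filter operations: one for the derivative in x and one for the derivative in y. It cannot be done in a single step. Calling Sobel with dx = 1 and dy = 1 at the same time computes the mixed second derivative, which responds to corners and isolated points rather than to straight contours. The results of the two separate steps have to be combined to form the final result of the edge detection.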

Info: I'm using float images (CV_32F) for simplicity.

Solution in code:

// Requires #include <opencv2/opencv.hpp> and #include <string>
// Load example image
std::string path = "C:\\Temp\\SobelTest\\Lobby2\\";
std::string filename = "pexels-photo-176162 scaled down.jpeg";
std::string fqn = path + filename;
cv::Mat img = cv::imread(fqn, cv::IMREAD_COLOR); // Value range: 0 - 255 (older OpenCV versions: CV_LOAD_IMAGE_COLOR)

// Convert to float and adapt value range (for simplicity)
img.convertTo(img, CV_32F, 1.f/255); // Value range: 0.0 - 1.0

// Build data for the 3x3 horizontal-gradient (x-derivative) Sobel kernel; it responds to vertical edges
float sobelKernelHorizontalData[3][3] = 
{
    {-1, 0, 1}, 
    {-2, 0, 2}, 
    {-1, 0, 1}
};
// Calculate normalization divisor/factor
float sobelKernelNormalizationDivisor = 4.f;
float sobelKernelNormalizationFactor = 1.f / sobelKernelNormalizationDivisor;

// Generate cv::Mat for the horizontal-gradient filter kernel
cv::Mat sobelKernelHorizontal = 
    cv::Mat(3,3, CV_32F, sobelKernelHorizontalData); // Value range of filter result (if it is used for filtering): -4*255 to 4*255 or -4.0 to 4.0
// Apply filter kernel normalization
sobelKernelHorizontal *= sobelKernelNormalizationFactor; // Value range of filter result (if it is used for filtering): -255 to 255 or -1.0 to 1.0

// Generate cv::Mat for the vertical-gradient filter kernel (transpose of the horizontal one)
cv::Mat sobelKernelVertical;
cv::transpose(sobelKernelHorizontal, sobelKernelVertical);

// Apply two distinct Sobel filtering steps
cv::Mat imgFilterResultVertical;
cv::Mat imgFilterResultHorizontal;
cv::filter2D(img, imgFilterResultVertical, CV_32F, sobelKernelVertical);
cv::filter2D(img, imgFilterResultHorizontal, CV_32F, sobelKernelHorizontal);

// Build overall filter result by combining the previous results
cv::Mat imgFilterResultMagnitude;
cv::magnitude(imgFilterResultVertical, imgFilterResultHorizontal, imgFilterResultMagnitude);

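// Write images to HDD. Important: convert (and scale) back to uchar, otherwise we get black images.
// Negative filter responses saturate to 0 in this conversion; only the magnitude image is non-negative everywhere.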
std::string filenameFilterResultVertical = path + "imgFilterResultVertical" + ".jpeg";
std::string filenameFilterResultHorizontal = path + "imgFilterResultHorizontal" + ".jpeg";
std::string filenameFilterResultMagnitude = path + "imgFilterResultMagnitude" + ".jpeg";
cv::Mat imgFilterResultVerticalUchar;
cv::Mat imgFilterResultHorizontalUchar;
cv::Mat imgFilterResultMagnitudeUchar;
imgFilterResultVertical.convertTo(imgFilterResultVerticalUchar, CV_8UC3, 255);
imgFilterResultHorizontal.convertTo(imgFilterResultHorizontalUchar, CV_8UC3, 255);
imgFilterResultMagnitude.convertTo(imgFilterResultMagnitudeUchar, CV_8UC3, 255);

cv::imwrite(filenameFilterResultVertical, imgFilterResultVerticalUchar);
cv::imwrite(filenameFilterResultHorizontal, imgFilterResultHorizontalUchar);
cv::imwrite(filenameFilterResultMagnitude, imgFilterResultMagnitudeUchar);

// Show images
cv::imshow("img", img);
cv::imshow("imgFilterResultVertical", imgFilterResultVertical);
cv::imshow("imgFilterResultHorizontal", imgFilterResultHorizontal);
cv::imshow("imgFilterResultMagnitude", imgFilterResultMagnitude);
cv::waitKey();

Note that the two filter2D calls and the magnitude step above are equivalent to this (dx = 1 gives the horizontal gradient, dy = 1 the vertical one):

cv::Sobel(img, imgFilterResultHorizontal, CV_32F, 1, 0, 3, sobelKernelNormalizationFactor);
cv::Sobel(img, imgFilterResultVertical, CV_32F, 0, 1, 3, sobelKernelNormalizationFactor);
cv::magnitude(imgFilterResultVertical, imgFilterResultHorizontal, imgFilterResultMagnitude);
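To see why a single call with dx = 1 and dy = 1, as in the question, highlights points instead of contours, it may help to look at one 3x3 patch by hand. The following is only a sketch in plain C++ without OpenCV. The names Patch3x3, respond, gradientMagnitude and mixedDerivative are made up for illustration, and the mixed kernel is assumed to be the outer product of [-1, 0, 1] with itself, which is what a separable 3x3 Sobel with both orders set to 1 amounts to.

```cpp
#include <array>
#include <cmath>

// A 3x3 block of values; used for both image patches and kernels.
using Patch3x3 = std::array<std::array<float, 3>, 3>;

// Correlate a patch with a kernel: sum of element-wise products.
float respond(const Patch3x3& patch, const Patch3x3& kernel)
{
    float sum = 0.f;
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            sum += patch[r][c] * kernel[r][c];
    return sum;
}

// x-derivative, y-derivative and mixed (dx = 1, dy = 1) kernels, unnormalized.
const Patch3x3 kSobelX  = {{{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}}};
const Patch3x3 kSobelY  = {{{-1, -2, -1}, {0, 0, 0}, {1, 2, 1}}};
const Patch3x3 kSobelXY = {{{1, 0, -1}, {0, 0, 0}, {-1, 0, 1}}};

// Two separate filter responses combined into one edge strength.
float gradientMagnitude(const Patch3x3& patch)
{
    return std::hypot(respond(patch, kSobelX), respond(patch, kSobelY));
}

// Response of the single mixed-derivative kernel.
float mixedDerivative(const Patch3x3& patch)
{
    return respond(patch, kSobelXY);
}
```

For a straight vertical edge (every row equal to 0, 0, 1), gradientMagnitude is non-zero while mixedDerivative is exactly zero. The mixed kernel only responds where the intensity changes in both directions at once, for example at a corner.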

Result images

Source image, vertical filter result, horizontal filter result, combined filter result (magnitude)

Brief notes on OpenCV's data types and value ranges

  • Working with float images (image type CV_32F) is often very useful and sometimes simpler. However, it is also slower, since 4 times the data is used (compared to uchar). So if you want correctness as well as high performance, you will have to use uchar images only and always pass the correct scale factors (e.g. the "alpha" parameter of convertTo or the "scale" parameter of cv::Sobel) to OpenCV functions. However, this is more error prone, and your values may overflow or saturate without you even realizing it.
  • 8-bit images (uchar, CV_8U) have a value range of 0 - 255. 32-bit float images (CV_32F) can hold any float value, but by convention, and when displayed with cv::imshow, the range 0.0 - 1.0 is mapped to black - white (values larger than 1.0 are displayed the same as 1.0, negative values the same as 0.0). Using float images is often easier since overflow is less likely to happen (however, bad scaling, e.g. values above 1.0, can still happen).
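The effect of converting a float value back to uchar can be sketched for a single value as follows. This is a plain C++ approximation, not OpenCV's actual implementation: floatToUchar is a made-up name, and OpenCV's own rounding of exact .5 values may differ from std::round.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Scale a float value from the 0.0 - 1.0 convention to 0 - 255,
// then round and clamp so out-of-range values saturate instead of wrapping.
std::uint8_t floatToUchar(float value)
{
    const float scaled = std::round(value * 255.f);
    const float clamped = std::min(255.f, std::max(0.f, scaled));
    return static_cast<std::uint8_t>(clamped);
}
```

A negative filter response such as -0.5 ends up as 0 (black), and a badly scaled value such as 1.7 ends up as 255 (white), so information outside 0.0 - 1.0 is lost in the saved image.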

Calculating the kernel normalization divisor

The normalization divisor for kernels can be calculated with the following formula:

f = max(abs(sumNegative), abs(sumPositive))

where sumNegative is the sum of negative values in the kernel and sumPositive the sum of positive values in the kernel.

WARNING: this is not equal to float normalizationDivisor = cv::sum(cv::abs(kernel))(0), which sums the absolute values of all coefficients (8 instead of 4 for the 3x3 Sobel kernel). You will have to write a custom function for this.
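Such a custom function could look roughly like the sketch below. It is an assumption-laden illustration rather than an OpenCV API: kernelNormalizationDivisor is a hypothetical name, and the kernel is passed as a flat std::vector<float> of coefficients instead of a cv::Mat to keep the example self-contained.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper (not part of OpenCV): returns
// max(abs(sumNegative), abs(sumPositive)) over all kernel coefficients.
float kernelNormalizationDivisor(const std::vector<float>& kernel)
{
    float sumNegative = 0.f; // sum of all negative coefficients
    float sumPositive = 0.f; // sum of all positive coefficients
    for (float value : kernel)
    {
        if (value < 0.f)
            sumNegative += value;
        else
            sumPositive += value;
    }
    return std::max(std::abs(sumNegative), std::abs(sumPositive));
}
```

For the 3x3 Sobel coefficients {-1, 0, 1, -2, 0, 2, -1, 0, 1} this yields 4, matching sobelKernelNormalizationDivisor in the code above.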

Additional tips

  • Edge detection is resolution dependent as well as edge thickness dependent. If the edges you want to detect are rather thick, you can use larger Sobel filter kernel sizes (see the question "Sobel filter kernel of large size"; however, do not use the accepted answer. Instead use Adam Bowen's answer, which is (most likely) the correct one). Of course, you can also scale down your image and use the default 3x3 Sobel filter to detect thick edges.
  • Larger filter kernels have different normalization divisors / factors.
  • The Sobel filter is only a rough approximation regarding neighborhood distances. The Scharr filter represents an improvement over the Sobel filter as it "improves rotational invariance" [http://johncostella.com/edgedetect/]
  • To save colored float images, you have to convert (and scale) them back to uchar using convertTo.

Edge detection on color images

It generally makes no sense to apply edge detection filters on color images. Having the image display which color channel (B, G, R) contributes how much to the edge detection and "encoding" this result into a colored pixel is a very specific and uncommon procedure. Of course if your goal is simply to make the image look "cool", then go ahead. In this case most rules won't apply anyway.

Update 2018-04-24

After repeatedly rethinking what I have written and working with image filtering over the years, I have to admit: there are valid and important cases where edge detection on color images is useful.

Simply put: you'd want edge detection on color images if there are edges in the image which would not be visible in the gray image. This is the case for an edge between two differently colored areas whose colors are clearly distinguishable while their gray values are (roughly) the same. This can be non-intuitive because, as humans, we're used to seeing in color. If your application has to be robust in such cases, you should prefer color over gray images for edge detection.

Since the filtering step on the color image results in a 3-channel edge image, the result has to be sensibly transformed into a single representative edge image.

This transformation step can be done in various ways:

  • Simple averaging
  • Weighting the channels the same way the B-, G-, and R-channels are weighted (0.11, 0.59, 0.30) when manually calculating the brightness of an image (which results in an edge image already very close to human perception)
  • Weighting by the humanly perceived contrast between the respective colors (there might be some LAB-based approach to this out there...)
  • Using the maximum value for each pixel from the 3 channels
  • etc.

It depends on what exactly you want to achieve and how much work you want to put into this. Generally the averaging or RGB-/BGR-based weighting will suffice.
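The averaging, brightness-weighted and maximum variants can be sketched per pixel as below. This is plain C++ for illustration only: EdgePixel, combineAverage, combineWeighted and combineMax are made-up names, and a pixel is assumed to hold non-negative edge strengths in B, G, R order.

```cpp
#include <algorithm>
#include <array>

// Edge strength of one pixel in the three channels, in B, G, R order.
using EdgePixel = std::array<float, 3>;

// Simple averaging of the three channels.
float combineAverage(const EdgePixel& p)
{
    return (p[0] + p[1] + p[2]) / 3.f;
}

// Brightness-style weighting with the B, G, R weights 0.11, 0.59, 0.30.
float combineWeighted(const EdgePixel& p)
{
    return 0.11f * p[0] + 0.59f * p[1] + 0.30f * p[2];
}

// Strongest response of the three channels.
float combineMax(const EdgePixel& p)
{
    return std::max({p[0], p[1], p[2]});
}
```

The maximum variant keeps an edge that shows up in only one channel at full strength, whereas averaging and weighting dampen it. Which behavior is preferable depends on the application.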

answered Oct 13 '22 at 01:10 by Baiz