
Why does the Sobel operator look that way?

Tags: image-processing, computer-vision, edge-detection

For image derivative computation, the Sobel operator looks like this:

    [-1 0 1]
    [-2 0 2]
    [-1 0 1]

There are 2 things I don't quite understand about it:

1. Why is the centre pixel 0? Can't I just use an operator like the one below?

    [-1 1]
    [-1 1]
    [-1 1]

2. Why is the centre row 2 times the other rows?

I googled my questions but didn't find any answer that convinced me. Please help me.

725

asked Jun 13 '13 02:06

Alcott

2 Answers

In computer vision, there's very often no perfect, universal way of doing something. Most often, we just try an operator, see its results and check whether they fit our needs. That's true for gradient computation too: the Sobel operator is one of many ways of computing an image gradient, and it has proved its usefulness in many use cases.

In fact, the simplest gradient operator we could think of is even simpler than the one you suggest above:

[-1 1]

Despite its simplicity, this operator has a first problem: when you use it, you compute the gradient between two positions and not at one position. If you apply it to 2 pixels (x,y) and (x+1,y), have you computed the gradient at position (x,y) or (x+1,y)? In fact, what you have computed is the gradient at position (x+0.5,y), and working with half pixels is not very handy. That's why we add a zero in the middle:

[-1 0 1]

Applying this one to pixels (x-1,y), (x,y) and (x+1,y) will clearly give you a gradient for the center pixel (x,y).
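To make the half-pixel point concrete, here is a minimal sketch in plain Python. The test signal f(x) = x^2 is my own choice for illustration (its true derivative is 2x), and the central difference is divided by 2 because it spans two pixels.

```python
# Sample f(x) = x^2 at integer pixel positions; the true derivative is 2x.
signal = [x * x for x in range(6)]  # [0, 1, 4, 9, 16, 25]

x = 2
# Filter [-1 1] applied to pixels x and x+1.
forward = signal[x + 1] - signal[x]
# Filter [-1 0 1] applied to pixels x-1, x, x+1, divided by the 2-pixel span.
central = (signal[x + 1] - signal[x - 1]) / 2

print(forward)  # 5, the true slope at x = 2.5 (halfway between the two pixels)
print(central)  # 4.0, the true slope at x = 2 (the center pixel)
```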

This one can also be seen as the sum of two shifted [-1 1] filters: [-1 1 0], which computes the gradient at position (x-0.5,y), to the left of the pixel, and [0 -1 1], which computes the gradient at position (x+0.5,y), to the right of the pixel.
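A quick check of how the two half-pixel filters combine, as a sketch in plain Python. The helper `convolve_full` is a hypothetical name introduced here for illustration. Adding the shifted filters element-wise gives [-1 0 1]; actually convolving [-1 1] with itself would instead give [1 -2 1], which is a second-derivative filter.

```python
left = [-1, 1, 0]    # gradient at (x-0.5, y)
right = [0, -1, 1]   # gradient at (x+0.5, y)

# Element-wise sum of the two shifted filters.
combined = [a + b for a, b in zip(left, right)]
print(combined)  # [-1, 0, 1]

def convolve_full(a, b):
    """Full discrete convolution of two 1D sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out

# For comparison: a true convolution of [-1 1] with itself.
second_derivative = convolve_full([-1, 1], [-1, 1])
print(second_derivative)  # [1, -2, 1]
```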

Now this filter still has another disadvantage: it's very sensitive to noise. That's why we decide not to apply it to a single row of pixels, but to 3 rows: this gives an average gradient over these 3 rows, which softens possible noise:

    [-1 0 1]
    [-1 0 1]
    [-1 0 1]

But this one tends to average things a little too much: when applied to one specific row, we lose much of what makes the detail of this specific row. To fix that, we want to give a little more weight to the center row, which will allow us to get rid of possible noise by taking into account what happens in the previous and next rows, but still keeping the specificity of that very row. That's what gives the Sobel filter:

    [-1 0 1]
    [-2 0 2]
    [-1 0 1]
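The trade-off between the three filters can be sketched on a single 3x3 patch. Everything here is an illustrative assumption: the pixel values are made up, `response` is a hypothetical helper, and each result is divided by the sum of the kernel's positive weights so the three outputs are comparable. The patch is a clean edge (rise of 10) whose center row carries an extra deviation of +6 on the right pixel.

```python
def response(patch, kernel, scale):
    """Apply a 3x3 kernel to a 3x3 patch (no flipping) and normalize."""
    total = sum(k * p
                for krow, prow in zip(kernel, patch)
                for k, p in zip(krow, prow))
    return total / scale

patch = [
    [10, 10, 20],
    [10, 10, 26],  # center row: same edge plus a +6 deviation
    [10, 10, 20],
]

single_row = [[0, 0, 0], [-1, 0, 1], [0, 0, 0]]
equal_rows = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
sobel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

r_single = response(patch, single_row, 1)
r_equal = response(patch, equal_rows, 3)
r_sobel = response(patch, sobel, 4)

print(r_single)  # 16.0: the deviation passes through untouched
print(r_equal)   # 12.0: only a third of the deviation survives
print(r_sobel)   # 13.0: half of the deviation survives
```

If the +6 is noise, the equal-rows filter suppresses it best; if it is genuine detail of the center row, the Sobel weights preserve more of it. That is the compromise described above.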

Tweaking the coefficients leads to other gradient operators such as the Scharr operator, which gives just a little more weight to the center row:

    [ -3 0  3]
    [-10 0 10]
    [ -3 0  3]
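One way to quantify "a little more weight" is to look at the share of the total row weight that goes to the center row. This is only a sketch: the row weights are read off the right-hand column of each kernel above, and the dictionary labels are my own.

```python
from fractions import Fraction

# Row weights (top, center, bottom) taken from each kernel's right column.
row_weights = {
    "equal rows": [1, 1, 1],
    "Sobel": [1, 2, 1],
    "Scharr": [3, 10, 3],
}

# Fraction of the total weight carried by the center row.
center_share = {
    name: Fraction(w[1], sum(w)) for name, w in row_weights.items()
}

for name, share in center_share.items():
    print(name, share)
# equal rows 1/3
# Sobel 1/2
# Scharr 5/8
```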

There are also mathematical reasons for this, such as the separability of these filters... but I prefer seeing it as an experimental discovery that turned out to have interesting mathematical properties, as experiment is, in my opinion, at the heart of computer vision. Only your imagination limits the creation of new ones, as long as they fit your needs...

109

answered Nov 13 '22 20:11

mbrenon

EDIT: The true reason that the Sobel operator looks that way can be found by reading an interesting article by Sobel himself. My quick reading of this article indicates that Sobel's idea was to get an improved estimate of the gradient by averaging the horizontal, vertical and diagonal central differences. Now, when you break the gradient into vertical and horizontal components, the diagonal central differences are included in both, while the vertical and horizontal central differences are only included in one. To avoid double counting, the diagonals should therefore have half the weight of the vertical and horizontal differences. The actual weights of 1 and 2 are just convenient for fixed-point arithmetic (and actually include a scale factor of 16).
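Here is a sketch of how that averaging could produce the familiar weights. It rests on an assumption of mine that is not spelled out above: each of the four neighbour pairs contributes its intensity difference divided by the distance between the two pixels, along the unit vector joining them. Under that assumption the x-component of a pair's contribution is weighted by vx / |v|^2, where v is the offset between the two pixels.

```python
from fractions import Fraction

# The four neighbour pairs around the center pixel, as (dx, dy) offsets,
# written as (positive end, negative end).
pairs = [
    ((1, 0), (-1, 0)),    # horizontal
    ((0, 1), (0, -1)),    # vertical
    ((1, 1), (-1, -1)),   # first diagonal
    ((1, -1), (-1, 1)),   # second diagonal
]

# Weight each pixel receives in the x-component of the summed estimates.
weights_x = {(dx, dy): Fraction(0)
             for dx in (-1, 0, 1) for dy in (-1, 0, 1)}
for p, q in pairs:
    vx, vy = p[0] - q[0], p[1] - q[1]
    dist_sq = vx * vx + vy * vy
    # (difference / |v|) times the x-part of the unit vector (vx / |v|).
    wx = Fraction(vx, dist_sq)
    weights_x[p] += wx
    weights_x[q] -= wx

# Multiply by 4 to clear the fractions and get integer weights.
kernel = [[int(4 * weights_x[(dx, dy)]) for dx in (-1, 0, 1)]
          for dy in (-1, 0, 1)]
for row in kernel:
    print(row)
# [-1, 0, 1]
# [-2, 0, 2]
# [-1, 0, 1]
```

The diagonal pairs end up with weight 1/4 against 1/2 for the horizontal pair, which is the "half the weight" noted above. Since the kernel is 4 times the sum of four estimates, it is 16 times their average, which would account for the scale factor of 16.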

I mostly agree with @mbrenon, but there are a couple of points that are too hard to make in a comment.

Firstly, in computer vision, the "Most often, we just try an operator" approach just wastes time and gives poor results compared to what might have been achieved. (That said, I like to experiment too.)

It is true that a good reason to use [-1 0 1] is that it centres the derivative estimate at the pixel. But another good reason is that it is the central difference formula, and you can prove mathematically that it gives a lower error in its estimate of the true derivative than [-1 1].
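The error claim can be checked numerically on a smooth signal whose derivative is known. This is a sketch under my own assumptions: the test function sin(0.3x), the sample position x = 2 and a pixel spacing of 1 are arbitrary choices, and a single example illustrates the claim rather than proving it.

```python
import math

def f(x):
    return math.sin(0.3 * x)

def f_prime(x):
    # Exact derivative of f, used as ground truth.
    return 0.3 * math.cos(0.3 * x)

x = 2
forward = f(x + 1) - f(x)            # [-1 1] estimate
central = (f(x + 1) - f(x - 1)) / 2  # [-1 0 1] estimate, 2-pixel span

forward_error = abs(forward - f_prime(x))
central_error = abs(central - f_prime(x))

print(central_error < forward_error)  # True
```

For this signal the forward difference is off by roughly 0.029 and the central difference by roughly 0.004.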

[1 2 1] is used to filter noise, as mbrenon said. The reason these particular numbers work well is that they approximate a Gaussian, which is the only filter that does not introduce artifacts (although from Sobel's article, this seems to be a coincidence). Now, if you want to reduce noise while finding a horizontal derivative, you want to filter in the vertical direction so as to affect the derivative estimate as little as possible. Convolving transpose([1 2 1]) with [-1 0 1] gives the Sobel operator, i.e.:

    [1]              [-1 0 1]
    [2] * [-1 0 1] = [-2 0 2]
    [1]              [-1 0 1]
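The same product can be written out in a few lines of plain Python. This sketch builds the 3x3 kernel as the outer product of the column [1 2 1] and the row [-1 0 1], then checks on one made-up patch that a horizontal difference per row followed by a vertical weighting gives the same value as the full 3x3 kernel. The patch values are arbitrary.

```python
smooth = [1, 2, 1]   # vertical smoothing weights
diff = [-1, 0, 1]    # horizontal central difference

# Outer product: column vector times row vector.
sobel = [[s * d for d in diff] for s in smooth]
print(sobel)  # [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

patch = [
    [1, 2, 4],
    [3, 5, 9],
    [2, 2, 7],
]

# One pass with the full 3x3 kernel.
full = sum(k * p
           for krow, prow in zip(sobel, patch)
           for k, p in zip(krow, prow))

# Two 1D passes: difference along each row, then weight the rows.
row_diffs = [sum(d * p for d, p in zip(diff, row)) for row in patch]
separable = sum(s * r for s, r in zip(smooth, row_diffs))

print(full, separable)  # 20 20
```

Splitting the filter this way is what separability means in practice: two short 1D passes replace one 2D pass.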

answered Nov 13 '22 20:11

Bull
