I am using a sobel filter of size 3x3 to calculate the image derivative. Looking at some articles on the internet, it seems that kernels for sobel filter for size 5x5 and 7x7 are also common, but I am not able to find their kernel values. Could someone please let me know the kernel values for sobel filter of size 5x5 and 7x7? Also, if someone could share a method to generate the kernel values, that will be much useful. Thanks in advance.

<h3>Complete solution for arbitrary Sobel kernel sizes and angles</h3> tl;dr: skip down to section 'Examples' To add another solution, expanding on this document (it's not particularly high quality, but it shows some usable graphics and matrices starting at the bottom of page 2). <h3>Goal</h3> What we're trying to do is estimate the local gradient of the image at position (x,y). The gradient is a vector made up of the components in x and y direction, gx and gy. Now, imagine we want to approximate the gradient based on our pixel (x,y) and its neighbours as a kernel operation (3x3, 5x5, or whatever size). <h3>Solution idea</h3> We can approximate the gradient by summing over the projections of all neighbor-center pairs onto the gradient direction. (Sobel's kernel is just a particular method of weighting the different contributions, and so is Prewitt, basically). <h3>Explicit intermediate steps for 3x3</h3> This is the local image, central pixel (x,y) marked as 'o' (center) <pre class="prettyprint"><code>a b c d o f g h i </code></pre> Let's say we want the gradient in positive x direction. The unit vector in positive x-direction is (1,0) [I'll later use the convention that the positive y direction is DOWN, i.e. (0,1), and that (0,0) is top left of image).] The vector from o to f ('of' for short) is (1,0). The gradient in direction 'of' is (f - o) / 1 (value of image at pixel here denoted f minus value at center o, divided by distance between those pixels). If we project the unit vector of that particular neighbor gradient onto our desired gradient direction (1,0) via a dot product we get 1. Here is a little table with the contributions of all neighbors, starting with the easier cases. Note that for diagonals, their distance is sqrt2, and the unit vectors in the diagonal directions are 1/sqrt2 * (+/-1, +/-1) <pre class="prettyprint"><code>f: (f-o)/1 * 1 d: (d-o)/1 * -1 because (-1, 0) dot (1, 0) = -1 b: (b-o)/1 * 0 because (0, -1) dot (1, 0) = 0 h: (h-o)/1 * 0 (as per b) a: (a-o)/sqrt2 * -1/sqrt2 distance is sqrt2, and 1/sqrt2*(-1,-1) dot (1,0) = -1/sqrt2 c: (c-o)/sqrt2 * +1/sqrt2 ... g: (g-o)/sqrt2 * -1/sqrt2 ... i: (i-o)/sqrt2 * +1/sqrt2 ... </code></pre> edit for clarification: There are two factors of 1/sqrt(2) for the following reason: <ol> <li>We are interested in the contribution to the gradient in a specific direction (here x), so we need to project the directional gradient from the center pixel to the neighbor pixel onto the direction we are interested in. This is accomplished by taking the scalar product of the unit vectors in the respective directions, which introduces the first factor 1/L (here 1/sqrt(2) for the diagonals).</li> <li>The gradient measures the infinitesimal change at a point, which we approximate by finite differences. In terms of a linear equation, m = (y2-y1)/(x2-x1). For this reason, the value difference from the center pixel to the neighbor pixel (y2-y1) has to be distributed over their distance (corresponds to x2-x1) in order to get the ascent units per distance unit. This yields a second factor of 1/L (here 1/sqrt(2) for the diagonals)</li> </ol> Ok, now we know the contributions. Let's simplify this expression by combining opposing pairs of pixel contributions. I'll start with d and f: <pre class="prettyprint"><code>{(f-o)/1 * 1} + {(d-o)/1 * -1} = f - o - (d - o) = f - d </code></pre> Now the first diagonal: <pre class="prettyprint"><code>{(c-o)/sqrt2 * 1/sqrt2} + {(g-o)/sqrt2 * -1/sqrt2} = (c - o)/2 - (g - o)/2 = (c - g)/2 </code></pre> The second diagonal contributes (i - a)/2. The perpendicular direction contributes zero. Note that all contributions from the central pixel 'o' vanish. We have now calculated the contributions of all closest neighbours to the gradient in positive x-direction at pixel (x,y), so our total approximation of the gradient in x-direction is simply their sum: <pre class="prettyprint"><code>gx(x,y) = f - d + (c - g)/2 + (i - a)/2 </code></pre> We can obtain the same result by using a convolution kernel where the coefficients are written in the place of the corresponding neighbor pixel: <pre class="prettyprint"><code>-1/2 0 1/2 -1 0 1 -1/2 0 1/2 </code></pre> If you don't want to deal with fractions, you multiply this by 2 and get the well-known Sobel 3x3 kernel. <pre class="prettyprint"><code> -1 0 1 G_x = -2 0 2 -1 0 1 </code></pre> The multiplication by two only serves to get convenient integers. The scaling of your output image is basically arbitrary, most of the time you normalize it to your image range, anyway (to get clearly visible results). By the same reasoning as above, you get the kernel for the vertical gradient gy by projecting the neighbor contributions onto the unit vector in positive y direction (0,1) <pre class="prettyprint"><code> -1 -2 -1 G_y = 0 0 0 1 2 1 </code></pre> <h3>Formula for kernels of arbitrary size</h3> If you want 5x5 or larger kernels, you only need to pay attention to the distances, e.g. <pre class="prettyprint"><code>A B 2 B A B C 1 C B 2 1 - 1 2 B C 1 C B A B 2 B A </code></pre> where <pre class="prettyprint"><code>A = 2 * sqrt2 B = sqrt5 C = sqrt2. </code></pre> If the length of the vector connecting any two pixels is L, the unit vector in that direction has a prefactor of 1/L. For this reason, the contributions of any pixel 'k' to (say) the x-gradient (1,0) can be simplified to "(value difference over squared distance) times (DotProduct of unnormalized direction vector 'ok' with gradient vector, e.g. (1,0) )" <pre class="prettyprint"><code>gx_k = (k - o)/(pixel distance^2) ['ok' dot (1,0)]. </code></pre> Because the dot product of the connecting vector with the x unit vector selects the corresponding vector entry, the corresponding G_x kernel entry at position k is just <pre class="prettyprint"><code>i / (i*i + j*j) </code></pre> where i and j are the number of steps from the center pixel to the pixel k in x and y direction. In the above 3x3 calculation, the pixel 'a' would have i = -1 (1 to the left), j = -1 (1 to the top) and hence the 'a' kernel entry is -1 / (1 + 1) = -1/2. The entries for the G_y kernel are <pre class="prettyprint"><code>j/(i*i + j*j). </code></pre> If I want integer values for my kernel, I follow these steps: <ul> <li>check the available range of the output image</li> <li>compute highest possible result from applying floating point kernel (i.e. assume max input value under all positive kernel entries, so output value is (sum over all positive kernel values) * (max possible input image value). If you have signed input, you need to consider the negative values as well. Worst case is then the sum of all positive values + sum of all abs values of negative entries (if max input under positives, -max input under negatives). edit: the sum of all abs values has also been aptly called the weight of the kernel</li> <li>calculate maximum allowed up-scaling for kernel (without overflowing range of output image)</li> <li>for all integer multiples (from 2 to above maximum) of floating point kernel: check which has the lowest sum of absolute round-off errors and use this kernel</li> </ul> So in summary: <pre class="prettyprint"><code>Gx_ij = i / (i*i + j*j) Gy_ij = j / (i*i + j*j) </code></pre> where i,j is position in the kernel counted from the center. Scale kernel entries as needed to obtain integer numbers (or at least close approximations). These formulae hold for all kernel sizes. <h3>Examples</h3> <pre class="prettyprint"><code> -2/8 -1/5 0 1/5 2/8 -5 -4 0 4 5 -2/5 -1/2 0 1/2 2/5 -8 -10 0 10 8 G_x (5x5) -2/4 -1/1 0 1/1 2/4 (*20) = -10 -20 0 20 10 -2/5 -1/2 0 1/2 2/5 -8 -10 0 10 8 -2/8 -1/5 0 1/5 2/8 -5 -4 0 4 5 </code></pre> Note that the central 3x3 pixels of the 5x5 kernel in float notation are just the 3x3 kernel, i.e. larger kernels represent a continued approximation with additional but lower-weighted data. This continues on to larger kernel sizes: <pre class="prettyprint"><code> -3/18 -2/13 -1/10 0 1/10 2/13 3/18 -3/13 -2/8 -1/5 0 1/5 2/8 3/13 -3/10 -2/5 -1/2 0 1/2 2/5 3/10 G_x (7x7) -3/9 -2/4 -1/1 0 1/1 2/4 3/9 -3/10 -2/5 -1/2 0 1/2 2/5 3/10 -3/13 -2/8 -1/5 0 1/5 2/8 3/13 -3/18 -2/13 -1/10 0 1/10 2/13 3/18 </code></pre> Exact integer representations become impractical at this point. As far as I can tell (don't have access to the original paper), the "Sobel" part to this is properly weighting the contributions. The Prewitt solution can be obtained by leaving out the distance weighting and just entering i and j in the kernel as appropriate. <h3>Bonus: Sobel Kernels for arbitrary directions</h3> So we can approximate the x and y components of the image gradient (which is actually a vector, as stated in the very beginning). The gradient in any arbitrary direction alpha (measured mathematically positive, in this case clockwise since positive y is downward) can be obtained by projecting the gradient vector onto the alpha-gradient unit vector. The alpha-unit vector is (cos alpha, sin alpha). For alpha = 0° you can obtain the result for gx, for alpha = 90° you get gy. <pre class="prettyprint"><code>g_alpha = (alpha-unit vector) dot (gx, gy) = (cos a, sin a) dot (gx, gy) = cos a * gx + sin a * gy </code></pre> If you bother to write down gx and gy as sums of neighbor contributions, you realize that you can group the resulting long expression by terms that apply to the same neighbor pixel, and then rewrite this as a single convolution kernel with entries <pre class="prettyprint"><code>G_alpha_ij = (i * cos a + j * sin a)/(i*i + j*j) </code></pre> If you want the closest integer approximation, follow the steps outlined above.

Sobel filter kernel of large size

Tags:

image-processing

computer-vision

edge-detection

sobel

I am using a sobel filter of size 3x3 to calculate the image derivative. Looking at some articles on the internet, it seems that kernels for sobel filter for size 5x5 and 7x7 are also common, but I am not able to find their kernel values.

Could someone please let me know the kernel values for sobel filter of size 5x5 and 7x7? Also, if someone could share a method to generate the kernel values, that will be much useful.

Thanks in advance.

895

asked Mar 05 '12 13:03

Aarkan

1 Answers

Complete solution for arbitrary Sobel kernel sizes and angles

tl;dr: skip down to section 'Examples'

To add another solution, expanding on this document (it's not particularly high quality, but it shows some usable graphics and matrices starting at the bottom of page 2).

Goal

What we're trying to do is estimate the local gradient of the image at position (x,y). The gradient is a vector made up of the components in x and y direction, gx and gy.

Now, imagine we want to approximate the gradient based on our pixel (x,y) and its neighbours as a kernel operation (3x3, 5x5, or whatever size).

Solution idea

We can approximate the gradient by summing over the projections of all neighbor-center pairs onto the gradient direction. (Sobel's kernel is just a particular method of weighting the different contributions, and so is Prewitt, basically).

Explicit intermediate steps for 3x3

This is the local image, central pixel (x,y) marked as 'o' (center)

a b c d o f g h i

Let's say we want the gradient in positive x direction. The unit vector in positive x-direction is (1,0) [I'll later use the convention that the positive y direction is DOWN, i.e. (0,1), and that (0,0) is top left of image).]

The vector from o to f ('of' for short) is (1,0). The gradient in direction 'of' is (f - o) / 1 (value of image at pixel here denoted f minus value at center o, divided by distance between those pixels). If we project the unit vector of that particular neighbor gradient onto our desired gradient direction (1,0) via a dot product we get 1. Here is a little table with the contributions of all neighbors, starting with the easier cases. Note that for diagonals, their distance is sqrt2, and the unit vectors in the diagonal directions are 1/sqrt2 * (+/-1, +/-1)

f:   (f-o)/1     * 1 d:   (d-o)/1     * -1       because (-1, 0) dot (1, 0) = -1 b:   (b-o)/1     * 0        because (0, -1) dot (1, 0) = 0 h:   (h-o)/1     * 0        (as per b) a:   (a-o)/sqrt2 * -1/sqrt2 distance is sqrt2, and 1/sqrt2*(-1,-1) dot (1,0) = -1/sqrt2 c:   (c-o)/sqrt2 * +1/sqrt2   ... g:   (g-o)/sqrt2 * -1/sqrt2   ... i:   (i-o)/sqrt2 * +1/sqrt2   ...

edit for clarification: There are two factors of 1/sqrt(2) for the following reason:

We are interested in the contribution to the gradient in a specific direction (here x), so we need to project the directional gradient from the center pixel to the neighbor pixel onto the direction we are interested in. This is accomplished by taking the scalar product of the unit vectors in the respective directions, which introduces the first factor 1/L (here 1/sqrt(2) for the diagonals).
The gradient measures the infinitesimal change at a point, which we approximate by finite differences. In terms of a linear equation, m = (y2-y1)/(x2-x1). For this reason, the value difference from the center pixel to the neighbor pixel (y2-y1) has to be distributed over their distance (corresponds to x2-x1) in order to get the ascent units per distance unit. This yields a second factor of 1/L (here 1/sqrt(2) for the diagonals)

Ok, now we know the contributions. Let's simplify this expression by combining opposing pairs of pixel contributions. I'll start with d and f:

{(f-o)/1 * 1} + {(d-o)/1 * -1} = f - o - (d - o) = f - d

Now the first diagonal:

{(c-o)/sqrt2 * 1/sqrt2} + {(g-o)/sqrt2 * -1/sqrt2} = (c - o)/2 - (g - o)/2 = (c - g)/2

The second diagonal contributes (i - a)/2. The perpendicular direction contributes zero. Note that all contributions from the central pixel 'o' vanish.

We have now calculated the contributions of all closest neighbours to the gradient in positive x-direction at pixel (x,y), so our total approximation of the gradient in x-direction is simply their sum:

gx(x,y) = f - d + (c - g)/2 + (i - a)/2

We can obtain the same result by using a convolution kernel where the coefficients are written in the place of the corresponding neighbor pixel:

-1/2  0  1/2  -1   0   1 -1/2  0  1/2

If you don't want to deal with fractions, you multiply this by 2 and get the well-known Sobel 3x3 kernel.

      -1 0 1 G_x = -2 0 2       -1 0 1

The multiplication by two only serves to get convenient integers. The scaling of your output image is basically arbitrary, most of the time you normalize it to your image range, anyway (to get clearly visible results).

By the same reasoning as above, you get the kernel for the vertical gradient gy by projecting the neighbor contributions onto the unit vector in positive y direction (0,1)

      -1 -2 -1 G_y =  0  0  0        1  2  1

Formula for kernels of arbitrary size

If you want 5x5 or larger kernels, you only need to pay attention to the distances, e.g.

A B 2 B A B C 1 C B 2 1 - 1 2 B C 1 C B A B 2 B A

where

A = 2 * sqrt2 B = sqrt5 C = sqrt2.

If the length of the vector connecting any two pixels is L, the unit vector in that direction has a prefactor of 1/L. For this reason, the contributions of any pixel 'k' to (say) the x-gradient (1,0) can be simplified to "(value difference over squared distance) times (DotProduct of unnormalized direction vector 'ok' with gradient vector, e.g. (1,0) )"

gx_k = (k - o)/(pixel distance^2) ['ok' dot (1,0)].

Because the dot product of the connecting vector with the x unit vector selects the corresponding vector entry, the corresponding G_x kernel entry at position k is just

i / (i*i + j*j)

where i and j are the number of steps from the center pixel to the pixel k in x and y direction. In the above 3x3 calculation, the pixel 'a' would have i = -1 (1 to the left), j = -1 (1 to the top) and hence the 'a' kernel entry is -1 / (1 + 1) = -1/2.

The entries for the G_y kernel are

j/(i*i + j*j).

If I want integer values for my kernel, I follow these steps:

check the available range of the output image
compute highest possible result from applying floating point kernel (i.e. assume max input value under all positive kernel entries, so output value is (sum over all positive kernel values) * (max possible input image value). If you have signed input, you need to consider the negative values as well. Worst case is then the sum of all positive values + sum of all abs values of negative entries (if max input under positives, -max input under negatives). edit: the sum of all abs values has also been aptly called the weight of the kernel
calculate maximum allowed up-scaling for kernel (without overflowing range of output image)
for all integer multiples (from 2 to above maximum) of floating point kernel: check which has the lowest sum of absolute round-off errors and use this kernel

So in summary:

Gx_ij = i / (i*i + j*j) Gy_ij = j / (i*i + j*j)

where i,j is position in the kernel counted from the center. Scale kernel entries as needed to obtain integer numbers (or at least close approximations).

These formulae hold for all kernel sizes.

Examples

          -2/8 -1/5  0  1/5  2/8           -5  -4  0   4   5           -2/5 -1/2  0  1/2  2/5           -8 -10  0  10   8 G_x (5x5) -2/4 -1/1  0  1/1  2/4  (*20) = -10 -20  0  20  10           -2/5 -1/2  0  1/2  2/5           -8 -10  0  10   8           -2/8 -1/5  0  1/5  2/8           -5  -4  0   4   5

Note that the central 3x3 pixels of the 5x5 kernel in float notation are just the 3x3 kernel, i.e. larger kernels represent a continued approximation with additional but lower-weighted data. This continues on to larger kernel sizes:

           -3/18 -2/13 -1/10 0  1/10 2/13 3/18            -3/13 -2/8  -1/5  0  1/5  2/8  3/13            -3/10 -2/5  -1/2  0  1/2  2/5  3/10 G_x (7x7)  -3/9  -2/4  -1/1  0  1/1  2/4  3/9             -3/10 -2/5  -1/2  0  1/2  2/5  3/10            -3/13 -2/8  -1/5  0  1/5  2/8  3/13            -3/18 -2/13 -1/10 0  1/10 2/13 3/18

Exact integer representations become impractical at this point.

As far as I can tell (don't have access to the original paper), the "Sobel" part to this is properly weighting the contributions. The Prewitt solution can be obtained by leaving out the distance weighting and just entering i and j in the kernel as appropriate.

Bonus: Sobel Kernels for arbitrary directions

So we can approximate the x and y components of the image gradient (which is actually a vector, as stated in the very beginning). The gradient in any arbitrary direction alpha (measured mathematically positive, in this case clockwise since positive y is downward) can be obtained by projecting the gradient vector onto the alpha-gradient unit vector.

The alpha-unit vector is (cos alpha, sin alpha). For alpha = 0° you can obtain the result for gx, for alpha = 90° you get gy.

g_alpha = (alpha-unit vector) dot (gx, gy)         = (cos a, sin a) dot (gx, gy)         = cos a * gx + sin a * gy

If you bother to write down gx and gy as sums of neighbor contributions, you realize that you can group the resulting long expression by terms that apply to the same neighbor pixel, and then rewrite this as a single convolution kernel with entries

G_alpha_ij = (i * cos a + j * sin a)/(i*i + j*j)

If you want the closest integer approximation, follow the steps outlined above.

answered Sep 30 '22 03:09

Daniel

Related questions
                            
                                3D reconstruction -- How to create 3D model from 2D image?
                            
                                Detection of coins (and fit ellipses) on an image
                            
                                How to apply blur to a UIView?
                            
                                Finding specific pixel colors of a BitmapImage
                            
                                How do I display a single image in PyTorch?
                            
                                Converting Numpy Array to OpenCV Array
                            
                                ImageMagick vs GD - which is faster, less resource intensive and produces better images?
                            
                                How to find table like structure in image
                            
                                Camera preview image data processing with Android L and Camera2 API
                            
                                module' object has no attribute 'drawMatches' opencv python
                            
                                Counting Cars OpenCV + Python Issue
                            
                                Adding borders to an image using python
                            
                                Extracting segments from a list of 8-connected pixels
                            
                                How to manipulate images at the pixel level in C#
                            
                                Cut out image in shape of text
                            
                                TypeError: Image data can not convert to float
                            
                                How do I find an image contained within an image?
                            
                                Getting the bounding box of the recognized words using python-tesseract
                            
                                Image comparison algorithm
                            
                                Crop black edges with OpenCV

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With