
TensorFlow dilation behaves differently than morphological dilation

As the following piece of code shows, the TensorFlow tf.nn.dilation2d function doesn't behave like a conventional dilation operator.

import tensorflow as tf  # TensorFlow 1.x API (InteractiveSession, .eval())
tf.InteractiveSession()
# Binary test image
A = [[0, 0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 1, 0, 0],
     [0, 0, 0, 1, 1, 1, 0],
     [0, 0, 0, 0, 1, 0, 0],
     [0, 0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, 0]]
# 3x3 kernel of ones, shape (height, width, depth)
kernel = tf.ones((3,3,1))
# Add batch and channel axes: (batch, height, width, channels)
input4D = tf.cast(tf.expand_dims(tf.expand_dims(A, -1), 0), tf.float32)
output4D = tf.nn.dilation2d(input4D, filter=kernel, strides=(1,1,1,1), rates=(1,1,1,1), padding="SAME")
print(tf.cast(output4D[0,:,:,0], tf.int32).eval())

It returns the following tensor:

array([[1, 1, 1, 2, 2, 2, 1],
       [1, 1, 2, 2, 2, 2, 2],
       [1, 1, 2, 2, 2, 2, 2],
       [1, 1, 2, 2, 2, 2, 2],
       [1, 1, 1, 2, 2, 2, 1],
       [1, 1, 1, 1, 1, 1, 1]], dtype=int32)

I understand neither why it behaves like that nor how I should use tf.nn.dilation2d to get the expected output:

array([[0, 0, 0, 1, 1, 1, 0],
       [0, 0, 1, 1, 1, 1, 1],
       [0, 0, 1, 1, 1, 1, 1],
       [0, 0, 1, 1, 1, 1, 1],
       [0, 0, 0, 1, 1, 1, 0],
       [0, 0, 0, 0, 0, 0, 0]], dtype=int32)

Can someone shed light on TensorFlow's terse documentation and explain what the tf.nn.dilation2d function does?

Jav asked Feb 14 '19 09:02




1 Answer

As mentioned in the documentation page linked,

Computes the grayscale dilation of 4-D input and 3-D filter tensors.

and

In detail, the grayscale morphological 2-D dilation is the max-sum correlation [...]

What this means is that the kernel's values are added to the image's values at each position, then the maximum value is taken as the output value.

Compare this to correlation, replacing the multiplication with an addition, and the integral (or sum) with the maximum:

      correlation: g(t) = ∫ f(𝜏) h(𝜏-t) d𝜏

      dilation: g(t) = max_𝜏 { f(𝜏) + h(𝜏-t) }

Or in the discrete world:

      correlation: g[n] = ∑_k f[k] h[k-n]

      dilation: g[n] = max_k { f[k] + h[k-n] }
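
To make the discrete formula concrete, here is a minimal 1-D sketch in plain Python. The function name dilate_1d is made up for illustration, and skipping samples that fall outside the signal is an assumption about border handling, not something taken from the TensorFlow documentation.

```python
NEG_INF = float("-inf")

def dilate_1d(f, h):
    """Max-sum correlation: g[n] = max_k { f[k] + h[k - n] }.

    h is stored as a list whose middle element corresponds to offset 0,
    so list index j maps to the offset k - n = j - len(h) // 2.
    Samples outside f are skipped (assumed border handling).
    """
    half = len(h) // 2
    out = []
    for n in range(len(f)):
        best = NEG_INF
        for j, weight in enumerate(h):
            k = n + j - half
            if 0 <= k < len(f):
                best = max(best, f[k] + weight)
        out.append(best)
    return out

signal = [0, 0, 1, 0, 0]
print(dilate_1d(signal, [0, 0, 0]))  # flat kernel: [0, 1, 1, 1, 0]
print(dilate_1d(signal, [1, 1, 1]))  # kernel of ones: [1, 2, 2, 2, 1]
```

With a kernel of zeros the result is just the local maximum; a kernel of ones shifts every output value up by 1.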


A dilation with a binary structuring element (what the question refers to as a “conventional dilation”) uses a kernel that contains only 1s and 0s. These indicate “included” and “excluded”: the 1s determine the domain of the structuring element.

To recreate the same behavior with a grey-value dilation, set the “included” pixels to 0 and the “excluded” pixels to minus infinity.

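For example, the 3x3 square structuring element used in the question should be a 3x3 matrix of zeros, not ones. With a kernel of ones, every output pixel is the local maximum plus 1, which is exactly the offset between the returned and the expected tensors in the question.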
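
As a rough check of this recipe without TensorFlow, the sketch below re-implements the max-sum correlation in plain Python and applies it to the image from the question. The helpers mask_to_kernel and dilate_2d are hypothetical names, and ignoring out-of-bounds pixels is an assumption chosen because it reproduces the tensors shown in the question.

```python
NEG_INF = float("-inf")

def mask_to_kernel(mask):
    """Binary structuring element -> grey-value kernel: 1 -> 0, 0 -> -inf."""
    return [[0 if m else NEG_INF for m in row] for row in mask]

def dilate_2d(image, kernel):
    """Grey-value dilation (max-sum correlation); out-of-bounds pixels are skipped."""
    rows, cols = len(image), len(image[0])
    kr, kc = len(kernel) // 2, len(kernel[0]) // 2
    out = []
    for y in range(rows):
        out_row = []
        for x in range(cols):
            best = NEG_INF
            for dy, kernel_row in enumerate(kernel):
                for dx, weight in enumerate(kernel_row):
                    yy, xx = y + dy - kr, x + dx - kc
                    if 0 <= yy < rows and 0 <= xx < cols:
                        best = max(best, image[yy][xx] + weight)
            out_row.append(best)
        out.append(out_row)
    return out

# Image from the question
A = [[0, 0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 1, 0, 0],
     [0, 0, 0, 1, 1, 1, 0],
     [0, 0, 0, 0, 1, 0, 0],
     [0, 0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, 0]]

square = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

# Kernel of zeros: behaves like the binary ("conventional") dilation
for row in dilate_2d(A, mask_to_kernel(square)):
    print(row)

# Kernel of ones, as in the question: same shape, every value shifted up by 1
for row in dilate_2d(A, square):
    print(row)
```

A non-square structuring element works the same way: pass its 1/0 mask through mask_to_kernel so that the excluded positions become minus infinity and can never win the maximum.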

Cris Luengo answered Oct 27 '22 20:10