
What are the uses of tf.space_to_depth?

I am reading this document on tf.space_to_depth. There, it says about the use of the function:

This operation is useful for resizing the activations between convolutions (but keeping all data), e.g. instead of pooling. It is also useful for training purely convolutional models.

However, I still don't have a clear understanding of this. Why is it sometimes necessary to resize the activations in a model?

Earthgod asked Mar 06 '18 06:03



1 Answer

space_to_depth is a technique often used in convolutional networks for lossless spatial dimensionality reduction. Applied to a tensor of shape (example_dim, height, width, channels) with block_size = k, it produces a tensor of shape (example_dim, height / block_size, width / block_size, channels * block_size ** 2). It works in the following manner (example_dim is skipped for simplicity):

  1. Cut the image / feature map into chunks of size (block_size, block_size, channels). For example, the following image (with block_size = 2):

    [[[1], [2], [3], [4]],
     [[5], [6], [7], [8]],
     [[9], [10], [11], [12]],
     [[13], [14], [15], [16]]]
    

    is divided into the following chunks:

    [[[1], [2]],       [[[3], [4]],
     [[5], [6]]]        [[7], [8]]]
    
    [[[9], [10]],      [[[11], [12]],
     [[13], [14]]]      [[15], [16]]]
    
  2. Flatten each chunk to a single array:

    [[1, 2, 5, 6]],      [[3, 4, 7, 8]]
    [[9, 10, 13, 14]],   [[11, 12, 15, 16]]
    
  3. Spatially rearrange chunks according to their initial position:

    [[[1, 2, 5, 6], [3, 4, 7, 8]],
     [[9, 10, 13, 14], [11, 12, 15, 16]]]
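
The three steps can be reproduced in a few lines of plain Python. This is only a minimal sketch to illustrate the rearrangement, not how TensorFlow implements the op: the function below is a hypothetical helper written for this answer, it works on nested lists of shape (height, width, channels) with the batch dimension left out, and it assumes that height and width are divisible by block_size.

```python
def space_to_depth(image, block_size):
    """Rearrange a (height, width, channels) nested list into
    (height // block_size, width // block_size, channels * block_size ** 2).

    Illustrative helper only, not part of any library.
    """
    height, width = len(image), len(image[0])
    if height % block_size or width % block_size:
        raise ValueError("height and width must be divisible by block_size")

    output = []
    for top in range(0, height, block_size):
        row = []
        for left in range(0, width, block_size):
            # Steps 1 and 2: take one block and flatten it row by row.
            flat = []
            for dy in range(block_size):
                for dx in range(block_size):
                    flat.extend(image[top + dy][left + dx])
            row.append(flat)
        # Step 3: blocks keep their relative position in the output grid.
        output.append(row)
    return output


image = [[[1], [2], [3], [4]],
         [[5], [6], [7], [8]],
         [[9], [10], [11], [12]],
         [[13], [14], [15], [16]]]

result = space_to_depth(image, 2)
print(result)
# [[[1, 2, 5, 6], [3, 4, 7, 8]], [[9, 10, 13, 14], [11, 12, 15, 16]]]
```

The printed result has shape (2, 2, 4) and contains all 16 input values, each exactly once.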
    

So, as you can see, the initial image of shape (4, 4, 1) was rearranged into a feature map of shape (2, 2, 4). This strategy is commonly used in applications such as object detection, segmentation, or super-resolution, where it is important to decrease the spatial size of an image without losing information (as pooling does). An example of an application of this technique can be found e.g. here.
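
To make the difference from pooling concrete, here is a small sketch in plain Python that compares the two on the same 4 x 4 image. All three functions are hypothetical helpers written for this illustration (nested lists of shape (height, width, channels), no batch dimension, sizes assumed divisible by the block size); they are not TensorFlow calls. The rearrangement can be undone exactly, while 2 x 2 max pooling keeps only one value per block.

```python
def space_to_depth(image, b):
    # Flatten every b x b block into the channel axis (row by row).
    return [[[v for dy in range(b) for dx in range(b)
              for v in image[top + dy][left + dx]]
             for left in range(0, len(image[0]), b)]
            for top in range(0, len(image), b)]


def depth_to_space(blocks, b):
    # Inverse rearrangement: spread each flattened block back out spatially.
    channels = len(blocks[0][0]) // (b * b)
    height, width = len(blocks) * b, len(blocks[0]) * b
    image = [[None] * width for _ in range(height)]
    for i, row in enumerate(blocks):
        for j, flat in enumerate(row):
            for k in range(b * b):
                dy, dx = divmod(k, b)
                image[i * b + dy][j * b + dx] = flat[k * channels:(k + 1) * channels]
    return image


def max_pool(image, b):
    # b x b max pooling with stride b, applied per channel.
    channels = len(image[0][0])
    return [[[max(image[top + dy][left + dx][c]
                  for dy in range(b) for dx in range(b))
              for c in range(channels)]
             for left in range(0, len(image[0]), b)]
            for top in range(0, len(image), b)]


image = [[[1], [2], [3], [4]],
         [[5], [6], [7], [8]],
         [[9], [10], [11], [12]],
         [[13], [14], [15], [16]]]

packed = space_to_depth(image, 2)
restored = depth_to_space(packed, 2)
pooled = max_pool(image, 2)

print(restored == image)  # True: nothing was lost
print(pooled)             # [[[6], [8]], [[14], [16]]]: 12 of 16 values are gone
```

Both outputs have spatial size 2 x 2, but only the space_to_depth result still carries every input value, so later layers can still use all of them.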

Marcin Możejko answered Sep 27 '22 03:09
