What is the behavior of SAME padding when stride is greater than 1?

Tags:

tensorflow

My understanding of SAME padding in TensorFlow is that padding is added so that the output dimensions (width and height) are the same as the input dimensions. However, this only really makes sense when stride=1, because with stride > 1 the output dimensions will almost certainly be smaller.

So I'm wondering what the algorithm is for calculating padding in this case. Is it simply that padding is added so that the filter is applied to every input value, rather than leaving some off on the right?

Stephen asked Jan 28 '18 21:01


2 Answers

Peter's answer is correct but lacks a few details, so let me add to it.

Autopadding = SAME means that: o = ceil(i/s), where o = output size, i = input size, s = stride.

In addition, the generic output size formula is:

o = floor( (i + p - k) / s)   +   1

Where the new terms are p (the total padding, summed over both sides) and k, i.e., the effective kernel size (including dilation, or just the kernel size if dilation is disabled).
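As a minimal sketch of the output size formula above (the function names are made up for illustration; the effective kernel size uses the usual dilation rule d*(k-1)+1):

```python
def effective_kernel_size(kernel_size, dilation=1):
    # Span covered by a dilated kernel; equals kernel_size when dilation is 1.
    return dilation * (kernel_size - 1) + 1


def conv_output_size(i, k, s, p):
    # o = floor((i + p - k) / s) + 1, with p the TOTAL padding (both sides)
    # and k the effective kernel size.
    return (i + p - k) // s + 1


print(conv_output_size(5, 3, 2, 2))  # 3
```

With i=5, k=3, s=2 and a total padding of 2, the result matches ceil(5/2) = 3, which is what SAME asks for.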

Solving that formula for p gives:

p_min = (o-1) s - i + k # i.e., when the floor is removed from the previous equation
p_max = o s - i + k - 1 # i.e., when the numerator of the floor % s is s-1

Any integer padding value p in the range [p_min, p_max] will satisfy the condition o = ceil(i/s), meaning that for a stride s there are exactly s solutions satisfying the formula.

It is the norm to use p_min as the padding (clamped to 0 if it comes out negative, which can happen when k < s), so you can ignore the other s-1 solutions.
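A quick way to convince yourself of the [p_min, p_max] range is to brute-force it. This is only a sketch with hypothetical helper names, not code from TensorFlow:

```python
def same_padding_range(i, k, s):
    o = -(-i // s)                # ceil(i / s) with integer arithmetic
    p_min = (o - 1) * s - i + k   # smallest total padding giving output o
    p_max = o * s - i + k - 1     # largest total padding giving output o
    return p_min, p_max


def output_size(i, k, s, p):
    return (i + p - k) // s + 1


i, k, s = 5, 3, 2
p_min, p_max = same_padding_range(i, k, s)
target = -(-i // s)
# Scan a window around the range and keep the paddings that hit the target.
valid = [p for p in range(p_min - 2, p_max + 3)
         if output_size(i, k, s, p) == target]
print(p_min, p_max, valid)  # 2 3 [2, 3]
```

There are exactly s = 2 valid values here, and SAME picks the smaller one.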

PS: This would be for 1D, but for nD, simply repeat these formulas independently for each dimension, i.e.,

p_min[dimension_index] = (o[dimension_index]-1)s[dimension_index] - i[dimension_index] + k[dimension_index]
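The per-dimension version could be sketched like this. The function name is hypothetical; clamping at 0 and putting the odd pixel on the trailing side (right/bottom) follow TensorFlow's documented SAME convention, but treat this as an illustration rather than the library's actual code:

```python
def same_padding_nd(input_shape, kernel_shape, strides):
    pads = []
    for i, k, s in zip(input_shape, kernel_shape, strides):
        o = -(-i // s)                       # ceil(i / s)
        total = max((o - 1) * s - i + k, 0)  # p_min, clamped at 0
        before = total // 2                  # left / top
        after = total - before               # right / bottom gets the extra one
        pads.append((before, after))
    return pads


print(same_padding_nd((5, 4), (3, 3), (2, 2)))  # [(1, 1), (0, 1)]
```

Each tuple is (padding before, padding after) for one dimension, so a 5x4 input with a 3x3 kernel and stride 2 gets one row on top and bottom, and a single column on the right only.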

For reference, these links are really useful:

  • https://arxiv.org/abs/1603.07285
  • https://towardsdatascience.com/a-comprehensive-introduction-to-different-types-of-convolutions-in-deep-learning-669281e58215
  • https://mmuratarat.github.io/2019-01-17/implementing-padding-schemes-of-tensorflow-in-python

Ginés Hidalgo answered Oct 13 '22 05:10


There is a formula for that:

n' = floor((n+2*p-f)/s + 1)

where n' is the output size, n is the input size, p is the padding on each side, f is the filter size and s is the stride.

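If you are using SAME padding with stride > 1, p is the smallest non-negative value for which n' = ceil(n/s); whenever that value is positive, (n+2*p-f) is exactly divisible by s. Note: p can be fractional (a multiple of 0.5) because it is the total padding averaged over the two sides of the image; when the total is odd, TensorFlow puts the extra pixel on the right/bottom.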
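For example, plugging numbers into the formula above (a rough sketch; `same_output_size` is a made-up helper, and p is the per-side average, so it may be fractional):

```python
import math


def same_output_size(n, f, s):
    # Total padding needed so that the output size is ceil(n / s).
    total = max((math.ceil(n / s) - 1) * s - n + f, 0)
    p = total / 2  # per-side average, may end in .5
    n_out = math.floor((n + 2 * p - f) / s + 1)
    return p, n_out


print(same_output_size(4, 3, 2))  # (0.5, 2)
```

Here n=4, f=3, s=2 needs a total of one padded pixel, so p = 0.5 and n' = 2 = ceil(4/2).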

PeterZhao answered Oct 13 '22 07:10