Why has TensorFlow chosen to prefer padding on the bottom right?
With SAME padding, to me it would feel logical to start the kernel's center anchor at the first real pixel. Because TensorFlow instead places the extra padding row/column on the bottom right, its results differ from those of some other frameworks. I do understand that asymmetric padding is good in principle, because otherwise one would be left with an unused padding row/column.
If TensorFlow had given precedence to padding on the left and top, it would compute convolutions and weights the same way as Caffe/cuDNN/$frameworks, and weight conversion would be compatible regardless of padding.
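For reference, TensorFlow documents its SAME padding as: compute the total padding needed so that the output length equals ceil(input_length / stride), put half of it (rounded down) on the left/top and the remainder on the right/bottom, so any odd leftover lands on the bottom right. Below is a minimal sketch of that rule for the 1-D case (the helper name tf_same_pad_1d is only illustrative, not a library function):

def tf_same_pad_1d(in_length, kernel_size, stride):
    # Total padding so that out_length == ceil(in_length / stride).
    if in_length % stride == 0:
        pad_total = max(kernel_size - stride, 0)
    else:
        pad_total = max(kernel_size - (in_length % stride), 0)
    pad_left = pad_total // 2
    pad_right = pad_total - pad_left  # any odd element goes to the right
    return pad_left, pad_right

print(tf_same_pad_1d(in_length=4, kernel_size=3, stride=2))  # (0, 1): right-heavy
print(tf_same_pad_1d(in_length=4, kernel_size=2, stride=2))  # (0, 0): the example below needs no padding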
Code:
import numpy as np
import tensorflow as tf
import torch
import torch.nn as nn
tf.enable_eager_execution()
def conv1d_tf(data, kernel_weights, stride):
    # TensorFlow expects the kernel as [width, in_channels, out_channels].
    filters = np.reshape(kernel_weights, [len(kernel_weights), 1, 1])
    out = tf.nn.conv1d(
        value=data,
        filters=filters,
        stride=stride,
        padding='SAME',  # TensorFlow chooses the padding split itself
        data_format='NCW',
    )
    return out
def conv1d_pytorch(data, kernel_weights, stride):
    # PyTorch expects the kernel as [out_channels, in_channels, width].
    filters = np.reshape(kernel_weights, [1, 1, len(kernel_weights)])
    kernel_size = len(kernel_weights)
    size = data.shape[-1]

    def same_padding(size, kernel_size, stride, dilation):
        padding = ((size - 1) * (stride - 1) + dilation * (kernel_size - 1)) // 2
        return padding

    padding = same_padding(size=size, kernel_size=kernel_size, stride=stride, dilation=0)
    conv = nn.Conv1d(
        in_channels=1,
        out_channels=1,
        kernel_size=kernel_size,
        stride=stride,
        bias=False,
        padding=padding,  # nn.Conv1d pads this amount on both sides (symmetric)
    )
    conv.weight = torch.nn.Parameter(torch.from_numpy(filters))
    return conv(torch.from_numpy(data))
data = np.array([[[1, 2, 3, 4]]], dtype=np.float32)
kernel_weights = np.array([0, 1], dtype=np.float32)
stride = 2
out_tf = conv1d_tf(data=data, kernel_weights=kernel_weights, stride=stride)
out_pytorch = conv1d_pytorch(data=data, kernel_weights=kernel_weights, stride=stride)
print('TensorFlow: %s' % out_tf)
print('pyTorch: %s' % out_pytorch)
Output:
TensorFlow: tf.Tensor([[[2. 4.]]], shape=(1, 1, 2), dtype=float32)
pyTorch: tensor([[[1., 3.]]], grad_fn=<SqueezeBackward1>)
Answer:
This is for historical compatibility reasons: it matches previous (non-public) frameworks. It is unfortunate that the definitions aren't clearer, since this is a common stumbling block when porting models between libraries.
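If you need the weights to transfer exactly, you can reproduce TensorFlow's right-heavy SAME padding in PyTorch by padding explicitly and then convolving with no built-in padding. This is a minimal sketch for the 1-D example above (the helper conv1d_tf_same and its structure are mine, not part of either library's API):

import numpy as np
import torch
import torch.nn.functional as F

def conv1d_tf_same(data, kernel_weights, stride):
    # Total padding so that out_length == ceil(in_length / stride),
    # with any odd element placed on the right, as TensorFlow does.
    in_length = data.shape[-1]
    kernel_size = len(kernel_weights)
    if in_length % stride == 0:
        pad_total = max(kernel_size - stride, 0)
    else:
        pad_total = max(kernel_size - (in_length % stride), 0)
    pad_left = pad_total // 2
    pad_right = pad_total - pad_left

    x = F.pad(torch.from_numpy(data), (pad_left, pad_right))  # explicit, possibly asymmetric padding
    weight = torch.from_numpy(np.reshape(kernel_weights, [1, 1, kernel_size]))
    return F.conv1d(x, weight, stride=stride)  # no built-in padding

data = np.array([[[1, 2, 3, 4]]], dtype=np.float32)
kernel_weights = np.array([0, 1], dtype=np.float32)
print(conv1d_tf_same(data, kernel_weights, stride=2))  # tensor([[[2., 4.]]]), matching TensorFlow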