I was reading this blog post on Hackernoon about how TensorFlow's
tf.image.resize_area()
function is not reflection equivariant. If I were to resize an image in a data augmentation step, this could really mess up the model training.
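For concreteness, "reflection equivariant" means that flipping an image and then resizing should give the same result as resizing and then flipping. Here is a toy numpy sketch of that check, using my own naive nearest-neighbour resize (the kind of grid alignment the article criticises, not TF's actual implementation), which fails the property:

```python
import numpy as np

def naive_resize(img, new_h, new_w):
    """Toy nearest-neighbour resize using the naive src = dst * scale mapping.

    Illustration only -- this is not TensorFlow's implementation.
    """
    h, w = img.shape
    rows = np.arange(new_h) * h // new_h  # anchors the grid at the top-left corner
    cols = np.arange(new_w) * w // new_w
    return img[np.ix_(rows, cols)]

img = np.arange(36).reshape(6, 6)

flip_then_resize = naive_resize(img[:, ::-1], 3, 3)
resize_then_flip = naive_resize(img, 3, 3)[:, ::-1]

# The corner-anchored sampling grid is not symmetric under reflection,
# so the two orders of operations disagree.
print(np.array_equal(flip_then_resize, resize_then_flip))  # → False
```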
The author goes on to say that users should avoid all of the tf.image.resize
functions because of this potentially unpredictable behavior. The article is from Jan 2018, so not that long ago. I checked the article's comments section, and no one has mentioned that the problems were fixed.
Are these problems still present, and if so, what is the workaround? Have subsequent versions of TensorFlow changed anything? For example, can I use the tf.keras
augmentation functions instead to avoid these problems?
After I originally read the Hackernoon article you've referenced, I also came across this article, which provides a nice summary of the different implementations of bilinear interpolation across OpenCV, TF 1.x, and some other DL frameworks.
I couldn't find anything on this in the TF 2.0 docs, so I reproduced the example given in that article to test the bilinear interpolation in 2.0. When I run the following code under TensorFlow 2.0, the test passes, so it looks like moving to TF 2.0 will give you an implementation of bilinear interpolation that matches OpenCV's (and therefore addresses the issues raised in the Hackernoon article):
import numpy as np
import tensorflow as tf


def test_tf2_resample_upsample_matches_opencv_methodology():
    """
    According to the article below, the TensorFlow 1.x implementation of bilinear
    interpolation for resizing images did not reproduce the pixel-area-based
    approach adopted by OpenCV. The `align_corners` option was set to False by
    default due to some questionable legacy reasons, but users were advised to set
    it to True in order to get a 'reasonable' output:
    https://jricheimer.github.io/tensorflow/2019/02/11/resize-confusion/

    This appears to have been fixed in TF 2.0, and this test confirms that we get
    the results one would expect from a pixel-area-based technique.

    We start with an input array whose values are equivalent to their column
    indices:

        input_arr = np.array([
            [[0], [1], [2], [3], [4], [5]],
            [[0], [1], [2], [3], [4], [5]],
        ])

    We then resize this (holding the rows dimension constant in size, but
    increasing the column dimension to 12) to reproduce the OpenCV example from
    the article. We expect this to produce the following output:

        expected_output = np.array([
            [[0], [0.25], [0.75], [1.25], [1.75], [2.25], [2.75], [3.25], [3.75], [4.25], [4.75], [5]],
            [[0], [0.25], [0.75], [1.25], [1.75], [2.25], [2.75], [3.25], [3.75], [4.25], [4.75], [5]],
        ])
    """
    input_tensor = tf.convert_to_tensor(
        np.array([
            [[0], [1], [2], [3], [4], [5]],
            [[0], [1], [2], [3], [4], [5]],
        ]),
        dtype=tf.float32,
    )
    output_arr = tf.image.resize(
        images=input_tensor,
        size=(2, 12),
        method=tf.image.ResizeMethod.BILINEAR,
    ).numpy()
    expected_output = np.array([
        [[0], [0.25], [0.75], [1.25], [1.75], [2.25], [2.75], [3.25], [3.75], [4.25], [4.75], [5]],
        [[0], [0.25], [0.75], [1.25], [1.75], [2.25], [2.75], [3.25], [3.75], [4.25], [4.75], [5]],
    ])
    np.testing.assert_almost_equal(output_arr, expected_output, decimal=2)
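For reference, the values the test expects can also be derived directly in numpy using the half-pixel-centre sampling convention that OpenCV uses (each destination pixel samples the source at src = (dst + 0.5) * scale - 0.5). This is a standalone sketch of that convention, not TF or OpenCV code:

```python
import numpy as np

# Half-pixel-centre bilinear upsampling of one row, 6 -> 12 columns.
src_vals = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
scale = 6 / 12

dst = np.arange(12)
src = (dst + 0.5) * scale - 0.5            # fractional source coordinates
src = np.clip(src, 0, len(src_vals) - 1)   # clamp at the image borders

lo = np.floor(src).astype(int)             # left neighbour
hi = np.minimum(lo + 1, len(src_vals) - 1) # right neighbour
frac = src - lo
out = src_vals[lo] * (1 - frac) + src_vals[hi] * frac

# out matches one row of expected_output above:
# 0, 0.25, 0.75, 1.25, 1.75, 2.25, 2.75, 3.25, 3.75, 4.25, 4.75, 5
print(out)
```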