I just read this article. The article says that the resize algorithm of TensorFlow has some bugs. Now I want to use scipy.misc.imresize instead of tf.image.resize_images, and I wonder what the best way is to implement the scipy resize algorithm.
Let's consider the following layer:
def up_sample(input_tensor, new_height, new_width):
    _up_sampled = tf.image.resize_images(input_tensor, [new_height, new_width])
    _conv = tf.layers.conv2d(_up_sampled, 32, [3, 3], padding="SAME")
    return _conv
How can I use the scipy algorithm in this layer?
Edit:
An example would be this function:
input_tensor = tf.placeholder("float32", [10, 200, 200, 8])
output_shape = [32, 210, 210, 8]

def up_sample(input_tensor, output_shape):
    new_array = np.zeros(output_shape)
    for batch in range(input_tensor.shape[0]):
        for channel in range(input_tensor.shape[-1]):
            new_array[batch, :, :, channel] = misc.imresize(input_tensor[batch, :, :, channel], output_shape[1:3])
    return new_array
But obviously scipy raises a ValueError because the tf.Tensor object does not have the right shape. I read that during a tf.Session the tensors are accessible as numpy arrays. How can I use the scipy function only during a session and skip its execution when the protocol buffer is created?
And is there a faster way than looping over all batches and channels?
Generally speaking, the tools you need are a combination of tf.map_fn and tf.py_func.

tf.py_func allows you to wrap a standard Python function into a TensorFlow op that is inserted into your graph. tf.map_fn allows you to call a function repeatedly on the batch samples, when the function cannot operate on the whole batch, as is often the case with image functions.

In the present case, I would probably advise using scipy.ndimage.zoom, on the basis that it can operate directly on the 4D tensor, which makes things simpler. On the other hand, it takes zoom factors as input, not sizes, so we need to compute them.
import tensorflow as tf
from scipy import ndimage

sess = tf.InteractiveSession()

# unimportant -- just a way to get an input tensor
batch_size = 13
im_size = 7
num_channel = 5
x = tf.eye(im_size)[None, ..., None] + tf.zeros((batch_size, 1, 1, num_channel))
new_size = 17

# wrap ndimage.zoom in a py_func; the zoom factors are computed from the sizes
new_x = tf.py_func(
    lambda a: ndimage.zoom(a, (1, new_size / im_size, new_size / im_size, 1)),
    [x], [tf.float32], stateful=False)[0]

print(new_x.eval().shape)
# (13, 17, 17, 5)
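Note that TensorFlow cannot infer the static shape of a tf.py_func output, so if you feed new_x into a shape-sensitive layer such as the tf.layers.conv2d in your up_sample, you will likely need to declare the shape yourself. A minimal sketch, assuming the sizes are known at graph construction time:

# py_func outputs have unknown static shape; declare it so conv2d can build
new_x.set_shape([batch_size, new_size, new_size, num_channel])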
You could use other functions (e.g. OpenCV's cv2.resize, Scikit-image's transform.resize, Scipy's misc.imresize), but none of them can operate directly on 4D tensors, and they are therefore more verbose to use, as sketched below. You may still want to use them if you want an interpolation other than zoom's spline-based interpolation.
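To illustrate that verbosity, here is a minimal sketch of the per-sample approach, reusing x, new_size and num_channel from above and wrapping misc.imresize with tf.map_fn and tf.py_func. The helper names are mine, not from any library; note that imresize converts its result to uint8 unless you pass mode='F':

import numpy as np
import tensorflow as tf
from scipy import misc

def resize_sample(img):
    # img: a single [height, width, channels] numpy array;
    # imresize only handles 2D images, so loop over the channels
    resized = [misc.imresize(img[..., c], (new_size, new_size), mode='F')
               for c in range(img.shape[-1])]
    return np.stack(resized, axis=-1).astype(np.float32)

def tf_resize_sample(img):
    out = tf.py_func(resize_sample, [img], tf.float32, stateful=False)
    # declare the per-sample shape so map_fn can stack the results
    out.set_shape([new_size, new_size, num_channel])
    return out

# map the per-sample python function over the batch dimension
new_x = tf.map_fn(tf_resize_sample, x)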
However, be aware of the following things:
Python functions are executed on the host. So, if you are executing your graph on a device like a graphics card, it needs to stop, copy the tensor to host memory, call your function, then copy the result back on the device. This can completely ruin your computation time if memory transfers are important.
Gradients do not pass through python functions. If your node is used, say, in an upscaling part of a network, layers upstream will not receive any gradient (or only part of it, if you have skip connections), which would compromise your training.
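You can verify the gradient issue directly; a minimal sketch:

import tensorflow as tf

inp = tf.placeholder(tf.float32, [3])
out = tf.py_func(lambda a: a * 2.0, [inp], tf.float32, stateful=False)
# py_func has no registered gradient, so nothing flows back to inp
print(tf.gradients(out, inp))  # expected: [None]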
For those two reasons, I would advise applying this kind of resampling to inputs only, when they are preprocessed on the CPU and gradients are not needed.
If you do want to use this upscale node for training on the device, then I see no alternative other than to either stick with the buggy tf.image.resize_images or write your own.
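For the "write your own" route, here is a minimal sketch of a nearest-neighbor upsampling built only from differentiable TF ops. The helper nn_upsample and the integer-factor restriction are assumptions of mine, not a general replacement for the resize op:

import tensorflow as tf

def nn_upsample(x, factor):
    # hypothetical helper: nearest-neighbor upscaling by an integer factor,
    # built from reshape/tile so it stays on the device and gradients flow
    s = tf.shape(x)
    x = tf.reshape(x, [s[0], s[1], 1, s[2], 1, s[3]])
    x = tf.tile(x, [1, 1, factor, 1, factor, 1])
    return tf.reshape(x, [s[0], s[1] * factor, s[2] * factor, s[3]])

# e.g. doubling the height and width of a [batch, h, w, c] tensor:
# up = nn_upsample(input_tensor, 2)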