I'm trying to take random crops from an image, just as Caffe does for data augmentation.
I know that TensorFlow already has a function for this:
img = tf.random_crop(img, [h, w, 3])
label = tf.random_crop(label, [h, w, 1])
But I'm not sure whether it takes the same crop for the image and the label. Also, this function cannot automatically zero-pad images that are smaller than the crop size [h, w] in one or both dimensions.
The padding, in turn, is done by
img = tf.image.resize_image_with_crop_or_pad(img, h, w)
label = tf.image.resize_image_with_crop_or_pad(label, h, w)
But that only takes center crops, not random crops.
Edit:
Here is some code showing how the padding could be done:
# Cropping dimensions (crops of 700 x 800)
crp_h = tf.constant(700)
crp_w = tf.constant(800)
shape = tf.shape(img)
img_h = shape[0]
img_w = shape[1]
img = tf.cond(img_h < crp_h, lambda: tf.image.pad_to_bounding_box(img, 0, 0, crp_h, img_w), lambda: img)
# Update image dimensions
shape = tf.shape(img)
img_h = shape[0]
img = tf.cond(img_w < crp_w, lambda: tf.image.pad_to_bounding_box(img, 0, 0, img_h, crp_w), lambda: img)
# Update image dimensions
shape = tf.shape(img)
img_w = shape[1]
Unfortunately, one cannot use the Python if conditional here, so one has to go with the ugly tf.cond(...) instead.
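For what it's worth, the branching can be avoided by padding each dimension to the larger of the crop size and the image size, which is a no-op when the image is already big enough. Below is a minimal NumPy sketch of that arithmetic, not TensorFlow graph code; `pad_to_at_least` is a hypothetical helper name introduced only for illustration. In a graph, the same idea would presumably be written with tf.maximum feeding tf.image.pad_to_bounding_box.

```python
import numpy as np

def pad_to_at_least(img, crop_h, crop_w):
    """Zero-pad an [H, W, C] array at the bottom/right up to the crop size.

    Hypothetical helper for illustration only.
    """
    img_h, img_w = img.shape[:2]
    # max(...) leaves a dimension untouched when it is already large enough,
    # so no explicit conditional is needed.
    target_h = max(crop_h, img_h)
    target_w = max(crop_w, img_w)
    pad = ((0, target_h - img_h), (0, target_w - img_w), (0, 0))
    return np.pad(img, pad, mode="constant")

small = np.ones((500, 900, 3), dtype=np.uint8)
padded = pad_to_at_least(small, 700, 800)
print(padded.shape)  # (700, 900, 3)
```

Only the height is padded here, because the width (900) already exceeds the crop width (800).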
I would suggest combining the image with the labels and randomly cropping them together:
import tensorflow as tf

def random_crop_and_pad_image_and_labels(image, labels, size):
  """Randomly crops `image` together with `labels`.

  Args:
    image: A Tensor with shape [H, W, N].
    labels: A Tensor with shape [H, W, M].
    size: A Tensor with shape [2] indicating the crop size [crop_h, crop_w].

  Returns:
    A tuple of (cropped_image, cropped_label).
  """
  # Stack image and labels along the channel axis so they share one crop.
  combined = tf.concat([image, labels], axis=2)
  image_shape = tf.shape(image)
  # Zero-pad at the bottom/right if the image is smaller than the crop.
  combined_pad = tf.image.pad_to_bounding_box(
      combined, 0, 0,
      tf.maximum(size[0], image_shape[0]),
      tf.maximum(size[1], image_shape[1]))
  last_label_dim = tf.shape(labels)[-1]
  last_image_dim = tf.shape(image)[-1]
  combined_crop = tf.random_crop(
      combined_pad,
      size=tf.concat([size, [last_label_dim + last_image_dim]],
                     axis=0))
  # Split the channels back into image and labels.
  return (combined_crop[:, :, :last_image_dim],
          combined_crop[:, :, last_image_dim:])
As an example:
cropped_image, cropped_labels = random_crop_and_pad_image_and_labels(
    image=tf.reshape(tf.range(4*4*3), [4, 4, 3]),
    labels=tf.reshape(tf.range(4*4), [4, 4, 1]),
    size=[2, 2])

with tf.Session() as session:
  print(session.run([cropped_image, cropped_labels]))
Prints something like:
[array([[[30, 31, 32],
        [33, 34, 35]],

       [[42, 43, 44],
        [45, 46, 47]]], dtype=int32), array([[[10],
        [11]],

       [[14],
        [15]]], dtype=int32)]
And a second example with an under-sized image:
cropped_image, cropped_labels = random_crop_and_pad_image_and_labels(
    image=tf.reshape(tf.range(4*1*3), [4, 1, 3]),
    labels=tf.reshape(tf.range(4*1), [4, 1, 1]),
    size=[2, 2])

with tf.Session() as session:
  print(session.run([cropped_image, cropped_labels]))
Prints:
[array([[[3, 4, 5],
        [0, 0, 0]],

       [[6, 7, 8],
        [0, 0, 0]]], dtype=int32), array([[[1],
        [0]],

       [[2],
        [0]]], dtype=int32)]
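If you want to sanity-check that the image and label crops really line up, a small NumPy sketch of the same concatenate, pad, crop, split idea may help. This is only an illustration under my own assumptions, not TensorFlow code: `random_crop_pair` is a hypothetical name, and it expects [H, W, C] and [H, W, M] arrays. The test data mirrors the first example above, where the pixel with flat index k holds [3k, 3k+1, 3k+2] in the image and [k] in the labels, so alignment can be checked arithmetically.

```python
import numpy as np

def random_crop_pair(image, labels, size, rng):
    """Crop `image` and `labels` at one shared random offset (hypothetical helper)."""
    crop_h, crop_w = size
    channels = image.shape[-1]
    # Stack along the channel axis so a single crop applies to both.
    combined = np.concatenate([image, labels], axis=-1)
    # Zero-pad at the bottom/right if the input is smaller than the crop.
    pad_h = max(crop_h - combined.shape[0], 0)
    pad_w = max(crop_w - combined.shape[1], 0)
    combined = np.pad(combined, ((0, pad_h), (0, pad_w), (0, 0)), mode="constant")
    # Draw one offset per spatial axis; the upper bound is exclusive.
    top = rng.integers(0, combined.shape[0] - crop_h + 1)
    left = rng.integers(0, combined.shape[1] - crop_w + 1)
    crop = combined[top:top + crop_h, left:left + crop_w]
    # Split the channels back into image and labels.
    return crop[..., :channels], crop[..., channels:]

rng = np.random.default_rng(0)
image = np.arange(4 * 4 * 3).reshape(4, 4, 3)
labels = np.arange(4 * 4).reshape(4, 4, 1)
img_crop, lbl_crop = random_crop_pair(image, labels, (2, 2), rng)

# Every cropped image pixel must come from the same location as its label.
assert np.array_equal(img_crop[..., 0], 3 * lbl_crop[..., 0])
```

The assertion holds for any random offset, which is the property that two independent random crops would not guarantee.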