I have an application with an input pipeline that uses a tf.data.Dataset of images and labels. Now I would like to use augmentations, and I'm trying to use the imgaug library for that purpose. However, I do not know how to do that: all the examples I have found use Keras' ImageDataGenerator or Sequence.
In code, given a sequential augmenter like this (where iaa is imgaug.augmenters):

self.augmenter = iaa.Sequential([
    iaa.Fliplr(config.sometimes),
    iaa.Crop(percent=config.crop_percent),
    ...
], random_order=config.random_order)
I am trying to apply that augmenter to batches of images in my dataset, without success. It seems that I cannot evaluate tensors, since my augmentations run inside a map function.
def augment_dataset(self, dataset):
    dataset = dataset.map(self.augment_fn())
    return dataset

def augment_fn(self):
    def augment(images, labels):
        img_array = tf.make_ndarray(images)
        images = self.augmenter.augment_images(img_array)
        return images, labels
    return augment
For example, if I try to use tf.make_ndarray I get an AttributeError: 'Tensor' object has no attribute 'tensor_shape'. Is this because Dataset.map does not run in eager mode? Any ideas on how to approach this?
I tried the suggested tf.numpy_function, as follows:
def augment_fn(self):
    def augment(images, labels):
        images = tf.numpy_function(self.augmenter.augment_images,
                                   [images],
                                   images.dtype)
        return images, labels
    return augment
However, the resulting images have an unknown shape, which results in other errors later on. How can I keep the original shape of the images? Before applying the augmentation function my batch of images has shape (batch_size, None, None, 1), but afterwards the shape is <unknown>.
I solved the unknown-shape issue by first finding the dynamic (true) shape of the images and then reshaping the result of applying the augmentation:
def augment_fn(self):
    def augment(images, labels):
        img_dtype = images.dtype
        # Capture the dynamic (runtime) shape before it is lost.
        img_shape = tf.shape(images)
        images = tf.numpy_function(self.augmenter.augment_images,
                                   [images],
                                   img_dtype)
        # Restore the shape information that tf.numpy_function discards.
        images = tf.reshape(images, shape=img_shape)
        return images, labels
    return augment
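For reference, a minimal usage sketch (my_dataset and batch_size are placeholder names, not from the original code; on older TF 2.x versions tf.data.AUTOTUNE is tf.data.experimental.AUTOTUNE):

# Hypothetical usage inside the same class; batch first so that
# augment() receives whole batches of images.
dataset = my_dataset.batch(batch_size)
dataset = self.augment_dataset(dataset)
dataset = dataset.prefetch(tf.data.AUTOTUNE)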
Image augmentation manipulations are forms of image preprocessing, but there is a critical difference: while image preprocessing steps are applied to training and test sets, image augmentation is only applied to the training data.
No: the model sees a different augmented version of each image in every epoch, for instance ten augmented variants of a single image over ten epochs. It does not change the number of samples in any way; the pipeline simply creates augmented versions of each image on the fly, which helps the model generalize better.
TensorFlow provides us with two methods we can use to apply data augmentation to our tf.data pipelines:

1. Use the Sequential class and the preprocessing module to build a series of data augmentation operations, similar to Keras' ImageDataGenerator class.
2. Apply tf.image functions to manually create the data augmentation routine.

Both approaches are sketched below.
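A rough sketch of both approaches (the specific ops and parameter values here are illustrative, not from the original post; on older TF 2.x versions the preprocessing layers live under tf.keras.layers.experimental.preprocessing):

import tensorflow as tf

# Approach 1: a Sequential pipeline of preprocessing layers.
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
])

# Approach 2: manual augmentation with tf.image ops inside a map function.
def augment_with_tf_image(image, label):
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, max_delta=0.1)
    return image, label

# dataset = dataset.map(augment_with_tf_image, num_parallel_calls=tf.data.AUTOTUNE)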
[Figure: Applying data augmentation using TensorFlow's tf.image processing operations.]
The first additional library, tensorflow-datasets, is directly used to download an image dataset of flowers. This dataset is licensed under a permissive Creative Commons 2.0 license, making it an ideal candidate for various tasks. As the name hints, the dataset is a collection of flower images.
This tutorial demonstrates data augmentation: a technique to increase the diversity of your training set by applying random (but realistic) transformations, such as image rotation. You will learn how to apply data augmentation in two ways: with Keras preprocessing layers, and with the tf.image methods.
Please go to the TF Dataset documentation to see why you need to restore the shapes of your images when you are using tf.py_function:
def tf_random_rotate_image(image, label):
    # Record the static shape before tf.py_function erases it.
    im_shape = image.shape
    [image,] = tf.py_function(random_rotate_image, [image], [tf.float32])
    # Restore the static shape information on the resulting tensor.
    image.set_shape(im_shape)
    return image, label
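Here, random_rotate_image is an ordinary Python function defined elsewhere in the tutorial, roughly along these lines (a sketch assuming scipy is available):

import numpy as np
from scipy import ndimage

def random_rotate_image(image):
    # Rotate by a random angle in [-30, 30] degrees; plain numpy in, numpy out.
    image = ndimage.rotate(image, np.random.uniform(-30, 30), reshape=False)
    return image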
Is this due to not using eager mode? I thought eager mode was the default in TF 2.0. Any ideas on how to approach this?
Yes, Dataset pre-processing is not executed in eager mode. This is, I assume, deliberate and certainly makes sense if you consider that Datasets can represent arbitrarily large (even infinite) streams of data.
Assuming that it is not possible or practical to translate the augmentation you are doing into TensorFlow operations (which would be the first choice!), you can use tf.numpy_function to execute arbitrary Python code (it is the replacement for the now-deprecated tf.py_func).
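As a minimal, self-contained sketch of how tf.numpy_function plugs into a Dataset.map (the double helper is purely illustrative):

import numpy as np
import tensorflow as tf

def double(x):
    # Arbitrary Python/numpy code; runs outside the TensorFlow graph.
    return x * 2

ds = tf.data.Dataset.from_tensor_slices(np.arange(5, dtype=np.int64))
ds = ds.map(lambda x: tf.numpy_function(double, [x], tf.int64))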