How to augment data in tensorflow tfrecords?

I store my data as TFRecords, read them as tensors with the Dataset API, and train with the Estimator API. I now want to do online data augmentation on each item in the dataset, but after trying for a while I cannot find a way to do it. I want random flips, random rotations, and other transformations.

I am following the instructions given in this tutorial with a custom estimator, which is my CNN, and I am not sure where the data augmentation step should occur.

deep_jandu asked Jan 19 '18 17:01




1 Answer

Using TFRecords doesn't prevent you from doing data augmentation.

Following the tutorial you linked in your comment, here is roughly what happens:

  • You create the dataset from the TFRecords files and parse each record to get an image and a label:

        # `filenames` is the list of TFRecords paths; `parse` decodes one record
        dataset = tf.data.TFRecordDataset(filenames=filenames)
        dataset = dataset.map(parse)

  • You can then map a second preprocessing function over the dataset to do data augmentation during training:

        # Only augment when we are training, not when evaluating
        if train:
            dataset = dataset.map(train_preprocess)

  • The train_preprocess function can be something like this:

        def train_preprocess(image, label):
            # A new random decision is made every time an element is read
            flip_image = tf.image.random_flip_left_right(image)
            # Other transformations...
            return flip_image, label
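
Since the question also asks about random rotation, here is a slightly fuller sketch of the same idea. It is only an illustration under some assumptions: images are square float tensors with values in [0, 1], TensorFlow is recent enough to provide tf.random.uniform (older 1.x releases spell it tf.random_uniform), and the helper name augment_dataset, the shuffle buffer size and the brightness delta are made up for this example rather than taken from the tutorial. Rotation is limited to multiples of 90 degrees because that is what tf.image.rot90 offers.

```python
import tensorflow as tf


def train_preprocess(image, label):
    """Randomly augment one (image, label) pair.

    Assumes `image` is a square HxWxC float tensor with values in [0, 1].
    """
    # Mirror the image horizontally with probability 0.5.
    image = tf.image.random_flip_left_right(image)

    # Rotate by a random multiple of 90 degrees: k is drawn from {0, 1, 2, 3}.
    k = tf.random.uniform([], minval=0, maxval=4, dtype=tf.int32)
    image = tf.image.rot90(image, k=k)

    # Shift brightness by a random amount, then clip back into [0, 1].
    image = tf.image.random_brightness(image, max_delta=0.1)
    image = tf.clip_by_value(image, 0.0, 1.0)
    return image, label


def augment_dataset(dataset, train, batch_size=32):
    """Hypothetical helper: shuffle and augment only in training mode."""
    if train:
        dataset = dataset.shuffle(buffer_size=1000)
        dataset = dataset.map(train_preprocess)
    return dataset.batch(batch_size)
```

With the Estimator API, the natural place for this is the training input function: build the dataset from the TFRecords, map the parse function, apply the augmentation, and return the result. Because the random ops run each time an element is read, every epoch sees differently augmented versions of the same records.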
Olivier Moindrot answered Oct 03 '22 21:10
