 

tf.contrib.data.Dataset repeat with shuffle, notice epoch end, mixed epochs?

Tags:

tensorflow

About tf.contrib.data.Dataset (from TensorFlow 1.2, see here and here) usage: When I use repeat (for multiple epochs) together with shuffle (as read_batch_features does internally), how will I notice when an epoch ends, and what the current epoch is? Also, when an epoch ends, will the ShuffleDataset first wait until it has dequeued everything, or will it already be filled with more data from the next epoch? In the last epoch, or if I don't use repeat, will the ShuffleDataset dequeue all remaining data, like tf.RandomShuffleQueue does after close?

My current solution, which also gives me more control: I don't use repeat but go over the data once, using ShuffleDataset to get shuffling like RandomShuffleQueue. At some point I get an OutOfRangeError and know that I have reached the end of the epoch. Then I reinitialize the iterator, as described here.
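
In code, this pattern looks roughly like the following (a minimal sketch: a toy range dataset stands in for my real input, using the TF 1.2-era tf.contrib.data API):

import tensorflow as tf
from tensorflow.contrib.data import Dataset

# One pass over the data, shuffled like a RandomShuffleQueue would shuffle it.
dataset = Dataset.range(100).shuffle(buffer_size=100).batch(10)

iterator = dataset.make_initializable_iterator()
next_batch = iterator.get_next()

with tf.Session() as sess:
    for epoch in range(3):
        sess.run(iterator.initializer)  # Reinitialize at the start of each epoch.
        while True:
            try:
                batch = sess.run(next_batch)
                # ... train on `batch` ...
            except tf.errors.OutOfRangeError:
                print("end of epoch", epoch)
                break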

Asked May 23 '17 by Albert



1 Answer

The behavior of Dataset.shuffle() depends on where in your pipeline it appears relative to the Dataset.repeat():

  • If you shuffle before the repeat, the sequence of outputs will first produce all records from epoch i, before any record from epoch i + 1.

  • If you shuffle after the repeat, the sequence of outputs may produce records from epoch i before or after records from epoch i + 1 (and even from epoch i + k, with a probability that increases with the buffer_size and decreases with k); see the sketch after this list.
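
For example, with toy range data the two orderings can be compared directly (a sketch using the TF 1.2-era API; the printed sequences are what differ):

import tensorflow as tf
from tensorflow.contrib.data import Dataset

# Shuffle BEFORE repeat: each epoch is a complete, separately shuffled pass.
before = Dataset.range(5).shuffle(buffer_size=5).repeat(2)

# Shuffle AFTER repeat: the shuffle buffer may straddle the epoch boundary,
# so records from consecutive epochs can interleave.
after = Dataset.range(5).repeat(2).shuffle(buffer_size=5)

with tf.Session() as sess:
    for name, dataset in [("shuffle->repeat", before), ("repeat->shuffle", after)]:
        next_element = dataset.make_one_shot_iterator().get_next()
        values = []
        try:
            while True:
                values.append(sess.run(next_element))
        except tf.errors.OutOfRangeError:
            pass
        print(name, values)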

If you want to perform some computation between epochs, and avoid mixing data from different epochs, it is probably easiest to avoid repeat() and catch the OutOfRangeError at the end of each epoch, reinitializing the iterator as the question describes.

There are some more interesting pipelines you could build to track the epoch number. For example, you could encode an epoch number as a component of each element:

# NOTE: `Dataset.range(None)` stands for an unbounded epoch counter;
# a concrete graph needs a finite count, e.g. `Dataset.range(2**31)`.
dataset = Dataset.range(None).flat_map(
    lambda epoch_num: Dataset.zip(
        (Dataset.from_tensors(epoch_num).repeat(),  # Infinite repeat of `epoch_num`.
         ...,  # Definition of a Dataset over a single epoch.
        )))

...where ... is the expression that defines a Dataset for a single epoch, and includes batching and shuffling.
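
A consumer of that pipeline can then watch the epoch component to notice boundaries. A sketch, with a hypothetical one_epoch() standing in for the elided single-epoch definition:

import tensorflow as tf
from tensorflow.contrib.data import Dataset

def one_epoch():
    # Hypothetical stand-in for the `...` above; the real version would
    # include shuffling and batching.
    return Dataset.range(3)

dataset = Dataset.range(10).flat_map(  # 10 epochs; use a large count as "infinite".
    lambda epoch_num: Dataset.zip(
        (Dataset.from_tensors(epoch_num).repeat(), one_epoch())))

epoch_t, example_t = dataset.make_one_shot_iterator().get_next()

with tf.Session() as sess:
    current_epoch = -1
    for _ in range(9):
        epoch, example = sess.run([epoch_t, example_t])
        if epoch != current_epoch:
            print("--- starting epoch", epoch)
            current_epoch = epoch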

Answered Oct 16 '22 by mrry