Output differences when changing order of batch(), shuffle() and repeat()

I have created a TensorFlow dataset, made it repeatable, shuffled it, divided it into batches, and constructed an iterator to get the next batch. But when I do this, the elements are sometimes repetitive (within and among batches), especially for small datasets. Why?

Miladiouss asked Apr 19 '18

People also ask

What does shuffle do in TensorFlow?

The Dataset.shuffle() method randomly shuffles the elements of a dataset. Parameters: buffer_size: the number of elements from which the new dataset will sample. seed (optional): seeds the shuffle's random number generator; to reproduce the same order, use the same seed.
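
A minimal TF 2.x sketch of shuffle() with the parameters described above (the 2.x eager API and as_numpy_iterator() are assumed):

import tensorflow as tf

# Toy dataset of 10 integers; a buffer covering the whole dataset
# gives a full shuffle, and a fixed seed makes it reproducible.
ds = tf.data.Dataset.range(10)
ds = ds.shuffle(buffer_size=10, seed=42)
print(list(ds.as_numpy_iterator()))  # a reproducible permutation of 0..9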

What does dataset repeat do?

Dataset.repeat(count=None): the method repeats the dataset count times. With the default (count=None, or count=-1), the dataset is repeated indefinitely.
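
A minimal sketch of repeat() (TF 2.x API assumed):

import tensorflow as tf

ds = tf.data.Dataset.range(3).repeat(2)  # repeat the data twice
print(list(ds.as_numpy_iterator()))      # [0, 1, 2, 0, 1, 2]
# With no count (the default), repeat() loops indefinitely.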

What is Buffer_size?

buffer_size: A tf.int64 scalar tf.Tensor, representing the maximum number of elements that will be buffered when prefetching.

What is TF data Autotune?

tf.data builds a performance model of the input pipeline and runs an optimization algorithm to find a good allocation of its CPU budget across all parameters specified as AUTOTUNE.
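
A sketch combining the two previous points, with both an explicit prefetch buffer and AUTOTUNE (tf.data.AUTOTUNE exists in TF >= 2.4; older versions use tf.data.experimental.AUTOTUNE):

import tensorflow as tf

ds = tf.data.Dataset.range(100).batch(10)

# Explicit buffer: keep up to 2 batches ready ahead of the consumer.
ds_manual = ds.prefetch(buffer_size=2)

# AUTOTUNE: let tf.data pick the buffer size dynamically at runtime.
ds_auto = ds.prefetch(tf.data.AUTOTUNE)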


2 Answers

Unlike what is stated in your own answer: no, shuffling and then repeating alone won't fix your problems.

The key source of your problem is that you batch, then shuffle/repeat. That way, the items in your batches will always be taken from contiguous samples in the input dataset. Batching should be one of the last operations you do in your input pipeline.
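
To see this concretely, here is a minimal sketch (TF 2.x eager API assumed; the answer below uses the 1.x iterator API): once you batch, a later shuffle can only permute whole, frozen batches.

import tensorflow as tf

# Batching first freezes contiguous pairs; shuffle() afterwards
# only reorders the batches, never the samples inside them.
ds = tf.data.Dataset.range(6).batch(2).shuffle(3)
print([b.tolist() for b in ds.as_numpy_iterator()])
# e.g. [[2, 3], [0, 1], [4, 5]] -- the pairs themselves never change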

Expanding the question slightly.

Now, there is a difference in the order in which you shuffle, repeat and batch, but it's not what you think. Quoting from the input pipeline performance guide:

If the repeat transformation is applied before the shuffle transformation, then the epoch boundaries are blurred. That is, certain elements can be repeated before other elements appear even once. On the other hand, if the shuffle transformation is applied before the repeat transformation, then performance might slow down at the beginning of each epoch related to initialization of the internal state of the shuffle transformation. In other words, the former (repeat before shuffle) provides better performance, while the latter (shuffle before repeat) provides stronger ordering guarantees.

Recapping

  • Repeat, then shuffle: you lose the guarantee that all samples are processed in one epoch.
  • Shuffle, then repeat: it is guaranteed that all samples will be processed before the next repeat begins, but you lose (slightly) in performance.

Whichever you choose, do that before batching.
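
As a sketch of the two options on a toy 4-element dataset (TF 2.x API assumed):

import tensorflow as tf

data = tf.data.Dataset.range(4)

# Shuffle before repeat: every epoch is a full permutation of 0..3,
# at the cost of re-initializing the shuffle buffer each epoch.
strong = data.shuffle(4).repeat(2).batch(2)

# Repeat before shuffle: the buffer mixes elements across epoch
# boundaries, so a sample can recur before another appears at all.
blurred = data.repeat(2).shuffle(4).batch(2)

for name, ds in [("shuffle-then-repeat", strong), ("repeat-then-shuffle", blurred)]:
    print(name, [b.tolist() for b in ds.as_numpy_iterator()])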

answered by GPhilo

You must shuffle first, and then repeat!

As the following snippets show, the order of shuffling, repeating, and batching matters.

Worst ordering:

import tensorflow as tf

ds = tf.data.Dataset.range(10)
ds = ds.batch(2)        # batch first: pairs are frozen as [0 1], [2 3], ...
ds = ds.repeat()        # repeat whole batches indefinitely
ds = ds.shuffle(100000) # shuffle whole batches across epoch boundaries
iterator   = ds.make_one_shot_iterator()  # TF 1.x API
next_batch = iterator.get_next()

with tf.Session() as sess:
    for i in range(15):
        if i % (10//2) == 0:  # separator every 5 batches (one epoch)
            print("------------")
        print("{:02d}:".format(i), next_batch.eval())

Output:

------------
00: [6 7]
01: [2 3]
02: [6 7]
03: [0 1]
04: [8 9]
------------
05: [6 7]
06: [4 5]
07: [6 7]
08: [4 5]
09: [0 1]
------------
10: [2 3]
11: [0 1]
12: [0 1]
13: [2 3]
14: [4 5]

Bad ordering:

import tensorflow as tf

ds = tf.data.Dataset.range(10)
ds = ds.batch(2)        # batch first: pairs are still frozen
ds = ds.shuffle(100000) # shuffle whole batches within an epoch
ds = ds.repeat()        # each epoch is a permutation of the same 5 batches
iterator   = ds.make_one_shot_iterator()
next_batch = iterator.get_next()

with tf.Session() as sess:
    for i in range(15):
        if i % (10//2) == 0:
            print("------------")
        print("{:02d}:".format(i), next_batch.eval())

Output:

------------
00: [4 5]
01: [6 7]
02: [8 9]
03: [0 1]
04: [2 3]
------------
05: [0 1]
06: [4 5]
07: [8 9]
08: [2 3]
09: [6 7]
------------
10: [0 1]
11: [4 5]
12: [8 9]
13: [2 3]
14: [6 7]

Best ordering:

Inspired by GPhilo's answer, the order of batching also matters. For batches to be different in each epoch, one must shuffle first, then repeat, and finally batch. As can be seen in the output, all batches are unique, unlike in the other orderings.

import tensorflow as tf

ds = tf.data.Dataset.range(10)

ds = ds.shuffle(100000) # shuffle individual samples first
ds = ds.repeat()        # then repeat across epochs
ds = ds.batch(2)        # batch last: pairs differ every epoch

iterator   = ds.make_one_shot_iterator()
next_batch = iterator.get_next()

with tf.Session() as sess:
    for i in range(15):
        if i % (10//2) == 0:
            print("------------")
        print("{:02d}:".format(i), next_batch.eval())

Output:

------------
00: [2 5]
01: [1 8]
02: [9 6]
03: [3 4]
04: [7 0]
------------
05: [4 3]
06: [0 2]
07: [1 9]
08: [6 5]
09: [8 7]
------------
10: [7 3]
11: [5 9]
12: [4 1]
13: [8 6]
14: [0 2]
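
For reference, make_one_shot_iterator and tf.Session belong to the TF 1.x API. In TF 2.x a dataset is directly iterable, so the best ordering above can be written as the following sketch (TF 2.x assumed):

import tensorflow as tf

ds = tf.data.Dataset.range(10)
ds = ds.shuffle(10)   # shuffle individual samples first
ds = ds.repeat()      # then repeat
ds = ds.batch(2)      # batch last

# In TF 2.x the dataset is directly iterable; no Session or iterator needed.
for i, batch in enumerate(ds.take(15)):
    if i % 5 == 0:
        print("------------")
    print("{:02d}:".format(i), batch.numpy())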

answered by Miladiouss