 

TensorFlow Dataset API: difference between make_initializable_iterator and make_one_shot_iterator

I want to know the difference between make_initializable_iterator and make_one_shot_iterator.
1. The TensorFlow documentation says that "A one-shot iterator does not currently support re-initialization." What exactly does that mean?
2. Are the following two snippets equivalent?
Using make_initializable_iterator:

iterator = data_ds.make_initializable_iterator()
data_iter = iterator.get_next()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for e in range(1, epoch+1):
    sess.run(iterator.initializer)
    while True:
        try:
            x_train, y_train = sess.run(data_iter)
            _, cost = sess.run([train_op, loss_op], feed_dict={X: x_train,
                                                               Y: y_train})
        except tf.errors.OutOfRangeError:
            break
sess.close()

Using make_one_shot_iterator:

iterator = data_ds.make_one_shot_iterator()
data_iter = iterator.get_next()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for e in range(1, epoch+1):
    while True:
        try:
            x_train, y_train = sess.run(data_iter)
            _, cost = sess.run([train_op, loss_op], feed_dict={X: x_train,
                                                               Y: y_train})
        except tf.errors.OutOfRangeError:
            break
sess.close()
asked Jan 04 '18 by Lion Lai



1 Answer

Suppose you want to use the same code for both training and validation. You might like to use the same iterator, but initialized to point at different datasets, something like the following:

def _make_batch_iterator(filenames):
    dataset = tf.data.TFRecordDataset(filenames)
    ...
    return dataset.make_initializable_iterator()


filenames = tf.placeholder(tf.string, shape=[None])
iterator = _make_batch_iterator(filenames)

with tf.Session() as sess:
    for epoch in range(num_epochs):

        # Initialize iterator with training data
        sess.run(iterator.initializer,
                 feed_dict={filenames: ['training.tfrecord']})

        _train_model(...)

        # Re-initialize iterator with validation data
        sess.run(iterator.initializer,
                 feed_dict={filenames: ['validation.tfrecord']})

        _validate_model(...)

With a one-shot iterator, you can't re-initialize it like this.
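This also answers the question's second point: the two snippets are not equivalent. A one-shot iterator is exhausted after the first pass over the data, so in the second snippet every epoch after the first hits OutOfRangeError immediately and trains on nothing. A plain-Python analogy (not the tf.data API itself; the ordinary Python iterator stands in for the dataset iterator's internal state) makes the difference visible:

```python
# Plain-Python sketch of the two iterator behaviors (an analogy, not TensorFlow).
data = [1, 2, 3]

# "one-shot": a single iterator -- consumed once, then permanently empty,
# which mimics the immediate OutOfRangeError in later epochs.
one_shot = iter(data)
first_pass = list(one_shot)    # consumes all elements
second_pass = list(one_shot)   # nothing left

# "initializable": re-create the iterator at the start of each epoch,
# analogous to calling sess.run(iterator.initializer).
epochs = []
for _ in range(2):
    it = iter(data)            # re-initialize
    epochs.append(list(it))

print(first_pass)   # [1, 2, 3]
print(second_pass)  # []
print(epochs)       # [[1, 2, 3], [1, 2, 3]]
```

With a one-shot iterator there is nothing analogous to `iterator.initializer` to run, which is why the second snippet only sees data in its first epoch; a common workaround is to build the repetition into the dataset itself (e.g. `Dataset.repeat`) before creating the one-shot iterator.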

answered Oct 13 '22 by Scott Smith