Feeding tensors for training vs validation data

In the TensorFlow examples, feed_dict is used to send either training or validation input into the same model graph. Unfortunately, you cannot feed tensors:

    Acceptable feed values include Python scalars, strings, lists, or numpy ndarrays.

I've been using an input pipeline and TFRecordReader, so my data never really enters Python. Having to call run just to pull the data into Python and then feed it straight back to TensorFlow seems silly, and it is definitely slow.

Does anyone have a good solution for this?

Currently, I just create two identical copies of the model subgraph that use the same parameters. This works, but forces me to organize my code in an odd way.

EDIT

For example, I'm currently doing something like:

model_params = BuildModelParams()
train_model = BuildModel(model_params, train_input)
test_model = BuildModel(model_params, test_input)

so that the test model uses the parameters learned by training. The nice thing about feed_dict is that I only need to define the model once and I do not have to separate the model's parameters from its structure.
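Concretely, the pattern looks something like the sketch below. The linear model is just a stand-in for illustration, and I'm writing it against the v1 compat API so it runs on current TensorFlow; the point is that BuildModelParams creates the variables once and both graph copies reuse them:

```python
import tensorflow.compat.v1 as tf  # v1-style graph API (assumed available)
tf.disable_eager_execution()

def BuildModelParams():
    # Parameters are created exactly once and shared by every model copy.
    return {"w": tf.Variable(2.0, name="w"),
            "b": tf.Variable(1.0, name="b")}

def BuildModel(params, inputs):
    # Same structure, same parameters, different input tensor.
    return params["w"] * inputs + params["b"]

# Stand-ins for the tensors coming out of the two input pipelines.
train_input = tf.constant(3.0)
test_input = tf.constant(5.0)

model_params = BuildModelParams()
train_model = BuildModel(model_params, train_input)
test_model = BuildModel(model_params, test_input)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run([train_model, test_model]))  # [7.0, 11.0]
```

Both outputs come from the same w and b, so anything training updates is immediately visible to the test copy, but the model structure has to be defined through functions like this rather than inline.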

Vince Gatto asked Feb 20 '16



1 Answer

Warning:

This solution can cause significant problems when input queues are involved. See: https://groups.google.com/a/tensorflow.org/forum/#!msg/discuss/mLrt5qc9_uU/sGNbC7GpAwAJ

Thanks to @fwalch for pointing this out in the comments


There's no way to do exactly what you're asking; see the answer to my question here.

But tf.cond, newly public as of version 0.7, can fill your use case:

# Here are the two data streams.
train_data = tf.Variable(999)
test_data = tf.Variable(1000)

# This selects which stream to use.
select_test = tf.placeholder(dtype=bool, shape=[], name='select_test')
data = tf.cond(
    select_test,
    lambda: test_data,
    lambda: train_data
)

# Here is the model.
model = data - 500

init = tf.initialize_all_variables()
with tf.Session():
    init.run()

    # You just have to feed `select_test` when you evaluate the model.
    print(model.eval({select_test: False}))  # 499
    print(model.eval({select_test: True}))   # 500

You can use the same sort of trick for switching Batch Normalization to use a moving average during test.
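For example, here's a toy sketch of that batch-norm-style switch, picking between the current batch's statistics and a stored moving average with the same kind of placeholder. The moving-average value is just illustrative, and I'm using the v1 compat API so it runs on current TensorFlow:

```python
import tensorflow.compat.v1 as tf  # v1-style graph API (assumed available)
tf.disable_eager_execution()

x = tf.constant([1.0, 2.0, 3.0])
batch_mean = tf.reduce_mean(x)    # statistics from the current batch
moving_mean = tf.Variable(10.0)   # accumulated during training (toy value)

is_test = tf.placeholder(dtype=bool, shape=[], name='is_test')
# At training time use batch statistics; at test time use the moving average.
mean = tf.cond(is_test, lambda: moving_mean.read_value(), lambda: batch_mean)
normalized = x - mean

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(mean, {is_test: False}))  # 2.0 (batch mean)
    print(sess.run(mean, {is_test: True}))   # 10.0 (moving average)
```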

mdaoust answered Nov 11 '22