
TensorFlow minibatch training

Tags:

tensorflow

How can I train a network in TensorFlow using minibatches of data? In the Deep MNIST tutorial, they use:

for i in range(1000):
    batch = mnist.train.next_batch(50)  # (images, labels) for 50 examples
    train_step.run(feed_dict={x: batch[0], y_: batch[1]})

My question is: are x and y_ variables with dimensions suited to a single example, and are batch[0], batch[1] lists of such inputs and outputs? In that case, will TensorFlow automatically add up the gradients for each training example in these lists? Or should I create my model so that x and y_ receive an entire minibatch?

My problem is that when I try to feed a list for each placeholder, TensorFlow tries to feed the entire list into the placeholder, and I get a size mismatch: Cannot feed value of shape (n, m) for Tensor u'ts:0', which has shape '(m,)', where n is the minibatch size and m is the size of an individual input.

Thanks.

asked Jul 02 '16 by yoki

People also ask

What is a good Minibatch size?

The results confirm that using small batch sizes achieves the best generalization performance, for a given computation cost. In all cases, the best results have been obtained with batch sizes of 32 or smaller. Often mini-batch sizes as small as 2 or 4 deliver optimal results.

What is the difference between batch and Minibatch?

Batch means that you use all your data to compute the gradient during one iteration. Mini-batch means you only take a subset of all your data during one iteration.

Does increasing batch size speed up training?

On the contrary, a big batch size can really speed up your training, and even give better generalization performance. A good way to find a suitable batch size is the Simple Noise Scale metric introduced in "An Empirical Model of Large-Batch Training".


1 Answer

In the MNIST tutorial, x and y_ are placeholders with a defined shape:

x = tf.placeholder(tf.float32, shape=[None, 784])  # None: batch size left open
y_ = tf.placeholder(tf.float32, shape=[None, 10])

shape=[None, 784] means that this placeholder has two dimensions.

So, to answer your first question:

are x and y_ variables with dimensions suitable to a single example

The first dimension can contain an undefined number of elements (1, 2, ..., 50, ...), while the second dimension must contain exactly 784 = 28*28 elements (the features of a single MNIST image).

Feeding the graph a Python list with shape [1, 784] or [50, 784] is exactly the same for TensorFlow; it handles either without any problem.
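For instance, a minimal sketch (TF 1.x, as in the tutorial; the reduce-mean op is just something to run) showing the same placeholder accepting both batch sizes:

import numpy as np
import tensorflow as tf  # TF 1.x, as in the tutorial

x = tf.placeholder(tf.float32, shape=[None, 784])
mean = tf.reduce_mean(x)  # any op will do; this is only for illustration

with tf.Session() as sess:
    # The None dimension accepts any batch size: 1, 50, ...
    print(sess.run(mean, feed_dict={x: np.zeros((1, 784))}))
    print(sess.run(mean, feed_dict={x: np.zeros((50, 784))}))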

batch[0], batch[1] are lists of such inputs and outputs?

In the tutorial, batch is defined by calling batch = mnist.train.next_batch(50). Thus:

  • batch[0] is a list with shape [50, 784]
  • batch[1] is a list with shape [50, 10]
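You can verify this yourself, assuming the standard TF 1.x MNIST input helpers used by the tutorial:

from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
batch = mnist.train.next_batch(50)
print(batch[0].shape)  # (50, 784): 50 flattened 28x28 images
print(batch[1].shape)  # (50, 10): 50 one-hot label vectors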

will TensorFlow automatically add the gradients for each training example in these lists? or should I create my model so that x and y_ get an entire minibatch?

TensorFlow handles this for you. The tutorial's loss is defined with tf.reduce_mean over the batch dimension, so a single training step computes the gradient of the average loss over the whole minibatch.
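Roughly how the tutorial wires this up (a sketch from memory of the TF 1.x softmax model, not a verbatim copy):

import tensorflow as tf  # TF 1.x

x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, W) + b  # logits for the whole batch at once

# reduce_mean averages the per-example losses over the batch dimension,
# so one run of train_step backpropagates the entire minibatch.
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)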

The error you're reporting, Cannot feed value of shape (n, m) for Tensor u'ts:0', which has shape '(m,)', is a shape mismatch: the placeholder was declared with shape (m,), i.e. for a single example, but you are feeding it a whole minibatch of shape (n, m).

Either reshape the input to match the placeholder or, more likely what you want, declare the placeholder with a leading None dimension as above, so it accepts an entire minibatch.
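To make the mismatch concrete, a sketch of the likely bug and its fix (the variable names here are hypothetical):

import tensorflow as tf  # TF 1.x

# Likely cause: a placeholder declared for a single example, shape '(m,)' ...
x_single = tf.placeholder(tf.float32, shape=[784])
# ... which cannot be fed a minibatch of shape (n, 784):
# "Cannot feed value of shape (50, 784) for Tensor ..., which has shape '(784,)'"

# Fix: leave the batch dimension undefined so any minibatch size fits.
x_batch = tf.placeholder(tf.float32, shape=[None, 784])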

answered Oct 02 '22 by nessuno