The MNIST set consists of 60,000 images for training set. While training my Tensorflow, I want to run the train step to train the model with the entire training set. The deep learning example on the Tensorflow website uses 20,000 iterations with a batch size of 50 (totaling to 1,000,000 batches). When I try more than 30,000 iterations, my number predictions fail (predicts 0 for all handwritten numbers). My questions is, how many iterations should I use with a batch size of 50 to train the tensorflow model with the entire MNIST set?
self.mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
for i in range(FLAGS.training_steps):
batch = self.mnist.train.next_batch(50)
self.train_step.run(feed_dict={self.x: batch[0], self.y_: batch[1], self.keep_prob: 0.5})
if (i+1)%1000 == 0:
saver.save(self.sess, FLAGS.checkpoint_dir + 'model.ckpt', global_step = i)
With Machine learning you tend to have serious cases of diminishing returns. for example here is a list of accuracy from one of my CNNs:
Epoch 0 current test set accuracy : 0.5399
Epoch 1 current test set accuracy : 0.7298
Epoch 2 current test set accuracy : 0.7987
Epoch 3 current test set accuracy : 0.8331
Epoch 4 current test set accuracy : 0.8544
Epoch 5 current test set accuracy : 0.8711
Epoch 6 current test set accuracy : 0.888
Epoch 7 current test set accuracy : 0.8969
Epoch 8 current test set accuracy : 0.9064
Epoch 9 current test set accuracy : 0.9148
Epoch 10 current test set accuracy : 0.9203
Epoch 11 current test set accuracy : 0.9233
Epoch 12 current test set accuracy : 0.929
Epoch 13 current test set accuracy : 0.9334
Epoch 14 current test set accuracy : 0.9358
Epoch 15 current test set accuracy : 0.9395
Epoch 16 current test set accuracy : 0.942
Epoch 17 current test set accuracy : 0.9436
Epoch 18 current test set accuracy : 0.9458
As you can see the returns start to fall off after ~10 Epochs*, however this may vary based on your network and learning rate. Based on how critical/ how much time you have the amount that is good to do varies, but I have found 20 to be a reasonable number
*I have always used the word epoch to mean one entire run through a data set but i am unaware as to the accuracy of that definition, each epoch here is ~429 training steps with batches of size 128.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With