
TensorFlow training becomes slower and slower after more than 10,000 iterations. Why?

I feed data to the graph through an input pipeline and use tf.train.shuffle_batch to generate batches. However, as training progresses, each iteration gets slower and slower. What is the underlying cause? Thanks very much! My code snippet is:

def main(argv=None):

    # define network parameters
    # weights
    # bias

    # define graph
    # graph network

    # define loss and optimization method
    # data = inputpipeline('*')
    # loss
    # optimizer

    # Initializing the variables
    init = tf.initialize_all_variables()

    # 'Saver' op to save and restore all the variables
    saver = tf.train.Saver()

    # Running session
    print "Starting session... "
    with tf.Session() as sess:

        # initialize the variables
        sess.run(init)

        # initialize the queue threads to start to shovel data
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(coord=coord)

        print "from the train set:"
        for i in range(train_set_size * epoch):
            _, d, pre = sess.run([optimizer, depth_loss, prediction])

        print "Training Finished!"

        # Save the variables to disk.
        save_path = saver.save(sess, model_path)
        print("Model saved in file: %s" % save_path)

        # stop our queue threads and properly close the session
        coord.request_stop()
        coord.join(threads)
        sess.close()
Lei asked Dec 28 '16 01:12



1 Answer

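When training, call sess.run once per iteration and fetch everything you need in that single call; run the variable initializer only once, before the loop. Because the data comes from an input queue, the queue runners also have to be started before the first training step. I recommend trying something like this, hope it helps:

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  coord = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(sess=sess, coord=coord)
  for i in range(train_set_size * epoch):
    _, d, pre = sess.run([optimizer, depth_loss, prediction])
  coord.request_stop()
  coord.join(threads)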
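Before changing the training loop, it can help to confirm how much each step actually slows down. The following is a minimal sketch in plain Python 3, not part of the original question or answer: time_steps and slowdown_ratio are hypothetical helper names, and step_fn stands in for whatever callable runs one training step (for example, a function that wraps the sess.run call above).

```python
import time


def time_steps(step_fn, num_steps, window=100):
    """Call step_fn num_steps times and return the mean seconds per step
    for each consecutive window of `window` steps."""
    window_means = []
    window_total = 0.0
    count = 0
    for _ in range(num_steps):
        start = time.perf_counter()
        step_fn()
        window_total += time.perf_counter() - start
        count += 1
        if count == window:
            window_means.append(window_total / count)
            window_total, count = 0.0, 0
    if count:  # trailing partial window
        window_means.append(window_total / count)
    return window_means


def slowdown_ratio(window_means):
    """Return the last window's mean step time divided by the first's."""
    if not window_means or window_means[0] == 0:
        raise ValueError("need a non-empty list with a non-zero first entry")
    return window_means[-1] / window_means[0]
```

Assuming the session and ops from the snippet above, this could be used as time_steps(lambda: sess.run([optimizer, depth_loss, prediction]), 10000). A ratio that keeps climbing well above 1 suggests the per-step cost itself is growing. One commonly reported cause in TensorFlow 1.x, though not confirmed for this code, is new ops being added to the graph inside the loop; calling sess.graph.finalize() before the loop makes any such addition raise an error.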
Sergio G answered Oct 30 '22 17:10
