TensorFlow: Each iteration in training for-loop slower [duplicate]

I'm training a standard, simple multilayer perceptron ANN with three hidden layers in TensorFlow. I added a text progress bar so I could watch the progress of iterating through the epochs. What I'm finding is that the processing time per iteration increases after the first few epochs. Here's an example screenshot showing the increase with each iteration:

[Screenshot: execution time per iteration increases with the number of iterations]

In this case, the first few iterations took roughly 1.05s/it and by 100% it was taking 4.01s/it.

The relevant code is listed here:

# ------------------------- Build the TensorFlow Graph -------------------------

with tf.Graph().as_default():

    (a bunch of statements for specifying the graph)

# --------------------------------- Training ----------------------------------

    sess = tf.InteractiveSession()
    sess.run(tf.initialize_all_variables())

    print "Start Training"

    pbar = tqdm(total = training_epochs)
    for epoch in range(training_epochs):
        avg_cost = 0.0
        batch_iter = 0

        while batch_iter < batch_size:
            train_features = []
            train_labels = []
            batch_segments = random.sample(train_segments, 20)
            for segment in batch_segments:
                train_features.append(segment[0])
                train_labels.append(segment[1])
            sess.run(optimizer, feed_dict={x: train_features, y_: train_labels})
            line_out = "," + str(batch_iter) + "\n"
            train_outfile.write(line_out)
            line_out = ",," + str(sess.run(tf.reduce_mean(weights['h1']), feed_dict={x: train_features, y_: train_labels}))
            line_out += "," + str(sess.run(tf.reduce_mean(weights['h2']), feed_dict={x: train_features, y_: train_labels}))
            line_out += "," + str(sess.run(tf.reduce_mean(weights['h3']), feed_dict={x: train_features, y_: train_labels})) + "\n"
            train_outfile.write(line_out)
            avg_cost += sess.run(cost, feed_dict={x: train_features, y_: train_labels})/batch_size

            batch_iter += 1

        pbar.update(1)  # Increment the progress bar by one
train_outfile.close()
print "Completed training"

In searching Stack Overflow, I found Processing time gets longer and longer after each iteration, where someone else had the same problem of each iteration taking longer than the last. However, I believe my case may be different, since they were clearly adding ops to the graph inside the loop with statements like:

distorted_image = tf.image.random_flip_left_right(image_tensor)

While I'm new to TensorFlow, I don't believe that I'm making the same mistake because the only stuff in my loop are sess.run() calls.

Any help is much appreciated.
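As a sanity check on the symptom itself, per-iteration cost can be timed directly rather than eyeballed from tqdm. The sketch below is TF-free and purely illustrative (all names are made up for this example): it flags a step function whose cost keeps growing across calls, which is exactly the signature of a graph gaining nodes every iteration.

```python
import time

def slows_down(step_fn, iters=40, warmup=5):
    """Time step_fn per call; return True if the last few calls are
    markedly slower than the first few after warmup (a hint that
    work is accumulating somewhere)."""
    times = []
    for _ in range(iters):
        t0 = time.perf_counter()
        step_fn()
        times.append(time.perf_counter() - t0)
    head = sum(times[warmup:warmup + 5]) / 5.0
    tail = sum(times[-5:]) / 5.0
    return tail > 2.0 * head

def make_leaky_step():
    # Cost grows on each call, mimicking a loop that keeps adding work.
    acc = []
    def step():
        acc.extend(range(20000))
        sum(acc)
    return step

def make_steady_step():
    # Constant cost per call, like a loop that only reruns fixed work.
    data = list(range(20000))
    def step():
        sum(data)
    return step

print(slows_down(make_leaky_step()), slows_down(make_steady_step()))
```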

asked Dec 25 '22 by DojoGojira

1 Answer

The three places where you have:

sess.run(tf.reduce_mean(weights['h1']), ...)

each append a new tf.reduce_mean() node to the graph on every iteration of the while loop. The graph therefore keeps growing, and every sess.run() call has more work to do, which is why each iteration is slower than the last. Create those nodes once, outside the loop, and reuse them:

with tf.Graph().as_default():
  ...
  m1 = tf.reduce_mean(weights['h1'])

while batch_iter < batch_size:
  ...
  line_out = ",," + str(sess.run(m1, feed_dict={x: train_features, y_: train_labels}))
answered Dec 26 '22 by Vincent Vanhoucke