Why does TensorFlow run slower and slower as training progresses?

Question

I am training an RNN. The first epoch took 7.5 hours, but TensorFlow gets slower and slower as training goes on, and the second epoch took 55 hours. I checked the code, and the calls that slow down most over time are these:

session.run([var1, var2, ...], feed_dict=feed)
tensor.eval(feed_dict=feed)

For example, one line of code is session.run([var1, var2, ...], feed_dict=feed). When the program starts, it takes 0.1 seconds, but the time spent on this line keeps growing as the process runs. After 10 hours, this line takes 10 seconds.

I have run into this several times. What causes it, and how can I avoid it?

Does this line of code add nodes to the TensorFlow graph: self.shapes = [numpy.zeros(g[1].get_shape(), numpy.float32) for g in self.compute_gradients]? I suspect this may be the reason. This line is called periodically, many times over the run, and self is not a tf.train.Optimizer object.

Vincent Renkens · Accepted Answer

Try finalizing your graph after you create it (graph.finalize()). This prevents any further operations from being added to the graph, so any code that tries to add one will raise an error instead of silently growing the graph. I also think self.compute_gradients is adding operations to the graph. Try defining the operation once outside your loop and only running it inside your loop.
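As a rough illustration of that advice, here is a minimal sketch. It assumes TensorFlow 2.x with the tf.compat.v1 graph API (on TensorFlow 1.x, a plain import tensorflow as tf should behave the same way). The tiny model and the names x, w, loss and grads are hypothetical stand-ins, not the asker's actual network.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    # Hypothetical toy model standing in for the real RNN.
    x = tf.placeholder(tf.float32, shape=[None, 3], name="x")
    w = tf.Variable(tf.ones([3, 1]), name="w")
    loss = tf.reduce_sum(tf.matmul(x, w))
    # Build the gradient op ONCE, outside the training loop.
    grads = tf.gradients(loss, [w])
    init = tf.global_variables_initializer()

# Freeze the graph: any later attempt to add an op raises RuntimeError.
graph.finalize()
ops_before = len(graph.get_operations())

with tf.Session(graph=graph) as sess:
    sess.run(init)
    for step in range(5):
        feed = {x: np.random.rand(4, 3).astype(np.float32)}
        # Only RUN existing ops inside the loop; nothing new is created here.
        loss_value, grad_values = sess.run([loss, grads], feed_dict=feed)

ops_after = len(graph.get_operations())

# Demonstrate that a finalized graph rejects new operations.
try:
    with graph.as_default():
        tf.constant(1.0)
    finalize_blocked = False
except RuntimeError:
    finalize_blocked = True

print("ops before/after loop:", ops_before, ops_after)
print("finalize blocked a new op:", finalize_blocked)
```

If the number of operations stays the same across iterations, the graph is not growing. If your own loop raises a RuntimeError after graph.finalize(), the traceback points at the line that was adding nodes on every step.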

Tags:

tensorflow

HY G
