An optimizer typically runs the same computation graph for many steps until convergence. Does TensorFlow set up the graph at the beginning and reuse it for every step? What if I change the batch size during training? What if I make some minor change to the graph, like changing the loss function? What if I make some major change to the graph? Does TensorFlow pre-generate all possible graphs? Does TensorFlow know how to optimize the entire computation when the graph changes?
As keveman says, from the client's perspective there is a single TensorFlow graph. In the runtime, there can be multiple pruned subgraphs that contain just the nodes that are necessary to compute the values t1, t2, etc. that you fetch when calling sess.run([t1, t2, ...]).
Calling sess.run([t1, t2]) will prune the overall graph (sess.graph) down to the subgraph required to compute those values: i.e. the operations that produce t1 and t2 and all of their antecedents. If you subsequently call sess.run([t3, t4]), the runtime will prune the graph down to the subgraph required to compute t3 and t4. Each time you pass a new combination of values to fetch, TensorFlow will compute a new pruned graph and cache it; this is why the first sess.run() with a given fetch combination can be somewhat slower than subsequent ones.
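To make this concrete, here is a minimal sketch using the 1.x-era API this answer assumes (the tensor names t1, t2, t3 and the placeholder shape are hypothetical, chosen only for illustration). It also suggests one answer to the batch-size question: a placeholder whose leading dimension is None lets the same graph handle different batch sizes without rebuilding anything.

```python
import tensorflow as tf

# One graph; the runtime prunes it per sess.run() call.
x = tf.placeholder(tf.float32, shape=[None, 4])  # None: batch size can vary per step
w = tf.Variable(tf.ones([4, 1]))
t1 = tf.matmul(x, w)        # needs x and w
t2 = tf.reduce_sum(t1)      # needs t1, and hence x and w
t3 = tf.reduce_sum(w)       # needs only w, not the placeholder x

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    # First call with this fetch combination: prune and cache the subgraph.
    sess.run([t1, t2], feed_dict={x: [[1., 2., 3., 4.]]})
    # Same fetches, different batch size: the cached pruned graph is reused.
    sess.run([t1, t2], feed_dict={x: [[1., 2., 3., 4.], [5., 6., 7., 8.]]})
    # New fetch combination: a different pruned subgraph is computed and cached.
    # This one does not include x at all, so no feed is required.
    sess.run(t3)
```

Because each distinct fetch combination is pruned and cached separately, the first call with a given combination pays the pruning cost and later calls with the same combination reuse the cached result.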
If the pruned graphs overlap, TensorFlow will reuse the "kernel" for the ops that are shared. This is relevant because some ops (e.g. tf.Variable and tf.FIFOQueue) are stateful, and their contents can be used in both pruned graphs. This allows you, for example, to initialize your variables with one subgraph (e.g. sess.run(tf.initialize_all_variables())), train them with another (e.g. sess.run(train_op)), and evaluate your model with a third (e.g. sess.run(loss, feed_dict={x: ...})). It also lets you enqueue elements to a queue with one subgraph, and dequeue them with another, which is the foundation of input pipelines in TensorFlow.
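As a sketch of that pattern (again using the 1.x-era API; the linear model, learning rate, and data below are made up purely for illustration), the following initializes, trains, and evaluates through three different fetch targets that all share the same stateful variables, and then shares a FIFOQueue's state between an enqueue subgraph and a dequeue subgraph:

```python
import tensorflow as tf

# Hypothetical linear model: three overlapping pruned subgraphs
# will share the same stateful tf.Variable kernels.
x = tf.placeholder(tf.float32, shape=[None, 1])
y = tf.placeholder(tf.float32, shape=[None, 1])
w = tf.Variable(tf.zeros([1, 1]))
b = tf.Variable(tf.zeros([1]))
pred = tf.matmul(x, w) + b
loss = tf.reduce_mean(tf.square(pred - y))
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

# A queue whose state is likewise shared between pruned subgraphs.
q = tf.FIFOQueue(capacity=10, dtypes=tf.float32)
enqueue_op = q.enqueue(7.0)
dequeue_op = q.dequeue()

with tf.Session() as sess:
    # Subgraph 1: just the variable initializers.
    sess.run(tf.initialize_all_variables())
    # Subgraph 2: the training step, which mutates w and b in place.
    for _ in range(100):
        sess.run(train_op, feed_dict={x: [[1.], [2.]], y: [[2.], [4.]]})
    # Subgraph 3: evaluation, reading the variables the training step updated.
    print(sess.run(loss, feed_dict={x: [[3.]], y: [[6.]]}))
    # Enqueue with one subgraph, dequeue with another: same queue state.
    sess.run(enqueue_op)
    print(sess.run(dequeue_op))  # 7.0
```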