I'm following this tutorial for TensorFlow. I'm trying to understand the arguments to tf.Session.run(). I understand that you have to run operations in a graph within a session. Is train_step passed in because it encapsulates all the operations of the network in this particular example? I'm trying to understand why I don't need to pass any other variables to the session, like cross_entropy:
sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
Here is the full code:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

import tensorflow as tf

# Model: a single softmax layer over the 784-pixel input
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)

# Loss: cross-entropy between the predictions y and the true labels y_
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

# Training loop
for _ in range(10):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

# Evaluation on the test set
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
In a TensorFlow session (tf.Session), you want to run (or execute) the optimizer operation (in this case train_step). The optimizer minimizes your loss function (in this case cross_entropy), which is evaluated or computed using the model hypothesis y.
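You don't need to pass cross_entropy explicitly because train_step already depends on it, but you can fetch both in the same call if you want to watch the loss. A minimal sketch, using the same names as the code above:

# Fetch the training op and the loss tensor in one call;
# train_step yields None, cross_entropy yields its current value.
_, loss_val = sess.run([train_step, cross_entropy],
                       feed_dict={x: batch_xs, y_: batch_ys})
print(loss_val)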
In the cascade approach, minimizing the cross_entropy loss reduces the error made when computing y, so the optimizer finds the values of the weights W that, when combined with x, produce a y that accurately approximates the true labels y_.
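If it helps, minimize(cross_entropy) is essentially shorthand for computing the gradients of the loss with respect to the variables and then applying a gradient descent update. A rough sketch of the equivalent explicit form:

# Roughly what GradientDescentOptimizer(0.5).minimize(cross_entropy) builds:
optimizer = tf.train.GradientDescentOptimizer(0.5)
grads_and_vars = optimizer.compute_gradients(cross_entropy, var_list=[W, b])
train_step = optimizer.apply_gradients(grads_and_vars)  # e.g. W -= 0.5 * dL/dW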
So, using a TensorFlow Session object tf.Session as sess, we run the optimizer train_step, which then evaluates every operation in the computational graph that train_step depends on:
sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
Because the cascade ultimately reaches cross_entropy, which makes use of the placeholders x and y_, you have to use the feed_dict to pass data to those placeholders.
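If you omit the feed_dict, TensorFlow has no values to propagate through the graph and raises an error (a sketch; the exact message varies by version):

try:
    sess.run(train_step)  # no feed_dict: placeholders x and y_ are unfed
except tf.errors.InvalidArgumentError as e:
    print(e.message)  # "You must feed a value for placeholder tensor ..."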
As you mentioned, TensorFlow is used to build a graph of operations. Your train_step operation (i.e. "minimize by gradient descent") is connected to, and depends on, the result of cross_entropy; cross_entropy itself relies on the results of y (the softmax operation) and y_ (the data assignment); and so on.
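A consequence of this dependency structure is that you only need to feed the placeholders that the fetched operation actually depends on. For instance, y depends on x but not on y_, so a sketch like this works with only x fed:

# y = softmax(x*W + b) depends only on the placeholder x,
# so feeding x alone is enough to compute predictions.
predictions = sess.run(y, feed_dict={x: batch_xs})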
When you call sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys}), you are basically asking TensorFlow: "run all the operations leading to train_step, and return its result (with x = batch_xs and y_ = batch_ys as input)". So yes, TensorFlow will itself walk your graph backward to figure out the operation/input dependencies of train_step, then execute all of those operations forward to return what you asked for.
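Conversely, fetching something with no placeholder in its dependency chain needs no feed_dict at all; for example, the variables W and b hold their own state and can be read directly (a sketch):

# Variables store their own values, so no feed_dict is required.
current_W, current_b = sess.run([W, b])
print(current_b)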