In the following Keras and TensorFlow implementations of the training of a neural network, how is model.train_on_batch([x], [y]) in the Keras implementation different from sess.run([train_optimizer, cross_entropy, accuracy_op], feed_dict=feed_dict) in the TensorFlow implementation? In particular: how can those two lines lead to different computations during training?
keras_version.py
from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import Adam

input_x = Input(shape=input_shape, name="x")
c = Dense(num_classes, activation="softmax")(input_x)
model = Model([input_x], [c])

opt = Adam(lr)
# metrics=['accuracy'] is needed so that train_on_batch also returns acc_batch below
model.compile(loss=['categorical_crossentropy'], optimizer=opt, metrics=['accuracy'])

nb_batchs = int(len(x_train) / batch_size)
for epoch in range(epochs):
    loss = 0.0
    for batch in range(nb_batchs):
        x = x_train[batch*batch_size:(batch+1)*batch_size]
        y = y_train[batch*batch_size:(batch+1)*batch_size]
        loss_batch, acc_batch = model.train_on_batch([x], [y])
        loss += loss_batch
    print(epoch, loss / nb_batchs)
tensorflow_version.py
import tensorflow as tf
from keras.layers import Input, Dense

input_x = Input(shape=input_shape, name="x")
c = Dense(num_classes)(input_x)  # no softmax here: the layer outputs raw logits

input_y = tf.placeholder(tf.float32, shape=[None, num_classes], name="label")
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=input_y, logits=c, name="xentropy"),
    name="xentropy_mean"
)
train_optimizer = tf.train.AdamOptimizer(learning_rate=lr).minimize(cross_entropy)

nb_batchs = int(len(x_train) / batch_size)

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(epochs):
        loss = 0.0
        acc = 0.0
        for batch in range(nb_batchs):
            x = x_train[batch*batch_size:(batch+1)*batch_size]
            y = y_train[batch*batch_size:(batch+1)*batch_size]
            feed_dict = {input_x: x, input_y: y}
            _, loss_batch = sess.run([train_optimizer, cross_entropy], feed_dict=feed_dict)
            loss += loss_batch
        print(epoch, loss / nb_batchs)
Note: This question follows Same (?) model converges in Keras but not in Tensorflow, which has been considered too broad, but in which I show exactly why I think those two statements are somehow different and lead to different computations.
Yes, the results can be different. They shouldn't be surprising if you know the following things in advance:

Cross-entropy in TensorFlow and Keras is computed from different inputs: tf.nn.softmax_cross_entropy_with_logits_v2 assumes its input is the raw, unnormalized logits, while Keras' categorical_crossentropy expects probabilities (i.e. the output of a softmax layer).
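To make that concrete, here is a minimal sketch (my own illustration, not from the original post, assuming standalone Keras 2.x on the TensorFlow 1.x backend; the sample values are made up). Each loss expects a different input, and they agree when each one is fed what it expects:

import numpy as np
import tensorflow as tf
from keras import backend as K

# One made-up sample with 3 classes; class 0 is the true label.
labels = np.array([[1.0, 0.0, 0.0]], dtype=np.float32)
logits = np.array([[2.0, 1.0, 0.1]], dtype=np.float32)   # raw, unnormalized scores
probs = tf.nn.softmax(logits)                             # normalized probabilities

# TensorFlow: takes the logits and applies the softmax internally.
tf_loss = tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=logits)

# Keras: takes probabilities, i.e. the output of a softmax layer.
keras_loss = K.categorical_crossentropy(K.constant(labels), probs)

with tf.Session() as sess:
    print(sess.run(tf_loss))     # ~0.417
    print(sess.run(keras_loss))  # ~0.417 -- same value, because each got the input it expects

Feeding probabilities where logits are expected (or vice versa) would silently give a different loss, which is exactly the kind of mismatch that makes two "identical" training scripts diverge.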
The optimizers in Keras and TensorFlow are also different: keras.optimizers.Adam and tf.train.AdamOptimizer are separate implementations and do not share exactly the same defaults (the epsilon used for numerical stability, for instance), so even with identical losses the weight updates need not be identical.
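If you want to rule the optimizer out as a source of divergence, one option (my suggestion, not something from the original post) is to have the Keras script use the native TensorFlow optimizer through Keras' TFOptimizer wrapper, so both scripts apply the same update rule. A minimal sketch, reusing model and lr from keras_version.py above and assuming standalone Keras 2.x on the TensorFlow backend:

import tensorflow as tf
from keras.optimizers import TFOptimizer

# Wrap the native TF optimizer so Keras applies the same updates as
# tf.train.AdamOptimizer(...).minimize(...) does in tensorflow_version.py.
opt = TFOptimizer(tf.train.AdamOptimizer(learning_rate=lr))
model.compile(loss=['categorical_crossentropy'], optimizer=opt, metrics=['accuracy'])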