I am building a simple Sequential model in Keras (TensorFlow backend). During training I want to inspect the individual training batches and model predictions. Therefore, I am trying to create a custom Callback that saves the model predictions and targets for each training batch. However, the model is not using the current batch for prediction, but the entire training data.
How can I hand over only the current training batch to the Callback?
And how can I access the batches and targets that the Callback saves in self.predhis and self.targets?
My current version looks as follows:
    callback_list = [prediction_history((self.x_train, self.y_train))]
    self.model.fit(self.x_train, self.y_train,
                   batch_size=self.batch_size,
                   epochs=self.n_epochs,
                   validation_data=(self.x_val, self.y_val),
                   callbacks=callback_list)

    class prediction_history(keras.callbacks.Callback):
        def __init__(self, train_data):
            self.train_data = train_data
            self.predhis = []
            self.targets = []

        def on_batch_end(self, epoch, logs={}):
            x_train, y_train = self.train_data
            self.targets.append(y_train)
            prediction = self.model.predict(x_train)
            self.predhis.append(prediction)
            tf.logging.info("Prediction shape: {}".format(prediction.shape))
            tf.logging.info("Targets shape: {}".format(y_train.shape))
NOTE: this answer is outdated and only works with TF1. Check @bers's answer for a solution tested on TF2.
After model compilation, the placeholder tensor for y_true is in model.targets and y_pred is in model.outputs.
To save the values of these placeholders at each batch, you can:
1. Copy the values of these tensors into variables.
2. Evaluate these variables in on_batch_end, and store the resulting arrays.
Step 1 is a bit involved, because you have to add a tf.assign op to the training function model.train_function. With the current Keras API, this can be done by providing a fetches argument to K.function() when the training function is constructed.
In model._make_train_function(), there's a line:
    self.train_function = K.function(inputs,
                                     [self.total_loss] + self.metrics_tensors,
                                     updates=updates,
                                     name='train_function',
                                     **self._function_kwargs)
The fetches argument containing the tf.assign ops can be provided via model._function_kwargs (only works after Keras 2.1.0).
As an example:
    from keras.layers import Dense
    from keras.models import Sequential
    from keras.callbacks import Callback
    from keras import backend as K
    import tensorflow as tf
    import numpy as np

    class CollectOutputAndTarget(Callback):
        def __init__(self):
            super(CollectOutputAndTarget, self).__init__()
            self.targets = []  # collect y_true batches
            self.outputs = []  # collect y_pred batches

            # the shape of these 2 variables will change according to batch shape
            # to handle the "last batch", specify `validate_shape=False`
            self.var_y_true = tf.Variable(0., validate_shape=False)
            self.var_y_pred = tf.Variable(0., validate_shape=False)

        def on_batch_end(self, batch, logs=None):
            # evaluate the variables and save them into lists
            self.targets.append(K.eval(self.var_y_true))
            self.outputs.append(K.eval(self.var_y_pred))

    # build a simple model
    # have to compile first for model.targets and model.outputs to be prepared
    model = Sequential([Dense(5, input_shape=(10,))])
    model.compile(loss='mse', optimizer='adam')

    # initialize the variables and the `tf.assign` ops
    cbk = CollectOutputAndTarget()
    fetches = [tf.assign(cbk.var_y_true, model.targets[0], validate_shape=False),
               tf.assign(cbk.var_y_pred, model.outputs[0], validate_shape=False)]
    model._function_kwargs = {'fetches': fetches}  # use `model._function_kwargs` if using `Model` instead of `Sequential`

    # fit the model and check results
    X = np.random.rand(10, 10)
    Y = np.random.rand(10, 5)
    model.fit(X, Y, batch_size=8, callbacks=[cbk])
Unless the number of samples can be divided by the batch size, the final batch will have a different size than other batches. So K.variable() and K.update() can't be used in this case. You'll have to use tf.Variable(..., validate_shape=False) and tf.assign(..., validate_shape=False) instead.
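To make the "last batch" issue concrete, here is a minimal NumPy-only sketch of how 10 samples split with a batch size of 8. The helper batch_bounds is hypothetical and written only for illustration; it mimics the idea of slicing the data into consecutive batches and is not Keras code.

```python
import numpy as np

def batch_bounds(num_samples, batch_size):
    """Return (start, end) index pairs that cover num_samples in order."""
    return [(start, min(start + batch_size, num_samples))
            for start in range(0, num_samples, batch_size)]

Y = np.random.rand(10, 5)          # same shape as the targets in the example
bounds = batch_bounds(len(Y), 8)   # batch_size=8, as in model.fit above
batch_shapes = [Y[start:end].shape for start, end in bounds]

print(bounds)        # [(0, 8), (8, 10)]
print(batch_shapes)  # [(8, 5), (2, 5)]
```

The second batch has only 2 rows, so a variable with a fixed shape of (8, 5) could not hold it. That is why the shape check is disabled on both the variable and the assign op.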
To verify the correctness of the saved arrays, you can add one line in training.py to print out the shuffled index array:
    if shuffle == 'batch':
        index_array = _batch_shuffle(index_array, batch_size)
    elif shuffle:
        np.random.shuffle(index_array)

    print('Index array:', repr(index_array))  # Add this line

    batches = _make_batches(num_train_samples, batch_size)
The shuffled index array should be printed out during fitting:
    Epoch 1/1
    Index array: array([8, 9, 3, 5, 4, 7, 1, 0, 6, 2])
    10/10 [==============================] - 0s 23ms/step - loss: 0.5670
And you can check if cbk.targets is the same as Y[index_array]:
    index_array = np.array([8, 9, 3, 5, 4, 7, 1, 0, 6, 2])
    print(Y[index_array])
    [[ 0.75325592  0.64857277  0.1926653   0.7642865   0.38901153]
     [ 0.77567689  0.13573623  0.4902501   0.42897559  0.55825652]
     [ 0.33760938  0.68195038  0.12303088  0.83509441  0.20991668]
     [ 0.98367778  0.61325065  0.28973401  0.28734073  0.93399794]
     [ 0.26097574  0.88219054  0.87951941  0.64887846  0.41996446]
     [ 0.97794604  0.91307569  0.93816428  0.2125808   0.94381495]
     [ 0.74813435  0.08036688  0.38094272  0.83178364  0.16713736]
     [ 0.52609421  0.39218962  0.21022047  0.58569125  0.08012982]
     [ 0.61276627  0.20679494  0.24124858  0.01262245  0.0994412 ]
     [ 0.6026137   0.25620512  0.7398164   0.52558182  0.09955769]]

    print(cbk.targets)
    [array([[ 0.7532559 ,  0.64857274,  0.19266529,  0.76428652,  0.38901153],
            [ 0.77567691,  0.13573623,  0.49025011,  0.42897558,  0.55825651],
            [ 0.33760938,  0.68195039,  0.12303089,  0.83509439,  0.20991668],
            [ 0.9836778 ,  0.61325067,  0.28973401,  0.28734073,  0.93399793],
            [ 0.26097575,  0.88219053,  0.8795194 ,  0.64887846,  0.41996446],
            [ 0.97794604,  0.91307569,  0.93816429,  0.2125808 ,  0.94381493],
            [ 0.74813437,  0.08036689,  0.38094273,  0.83178365,  0.16713737],
            [ 0.5260942 ,  0.39218962,  0.21022047,  0.58569127,  0.08012982]], dtype=float32),
     array([[ 0.61276627,  0.20679495,  0.24124858,  0.01262245,  0.0994412 ],
            [ 0.60261369,  0.25620511,  0.73981643,  0.52558184,  0.09955769]], dtype=float32)]
As you can see, there are two batches in cbk.targets (one "full batch" of size 8 and the final batch of size 2), and the row order is the same as Y[index_array].
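Instead of comparing the printed arrays by eye, the check can also be done programmatically. The following is only a sketch with stand-in data: batches_match is a hypothetical helper, and the list collected is built by hand here, whereas in a real run it would be cbk.targets after a single epoch. The comparison uses a tolerance because the collected arrays are float32 while Y is float64.

```python
import numpy as np

def batches_match(collected, expected, atol=1e-6):
    """Stack the per-batch arrays and compare them to the expected rows."""
    stacked = np.concatenate(collected, axis=0)
    return stacked.shape == expected.shape and bool(
        np.allclose(stacked, expected, atol=atol))

# Stand-in data (assumption): in practice `collected` would be cbk.targets
rng = np.random.default_rng(0)
Y = rng.random((10, 5))
index_array = np.array([8, 9, 3, 5, 4, 7, 1, 0, 6, 2])
shuffled = Y[index_array].astype(np.float32)
collected = [shuffled[:8], shuffled[8:]]  # one batch of 8, one batch of 2

print(batches_match(collected, Y[index_array]))  # True
```

The same np.concatenate call is a simple way to turn the collected list of batches into one array for further inspection. Note that the lists keep growing across epochs, so with more than one epoch they would need to be split per epoch first.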