
Create keras callback to save model predictions and targets for each batch during training

I am building a simple Sequential model in Keras (tensorflow backend). During training I want to inspect the individual training batches and model predictions. Therefore, I am trying to create a custom Callback that saves the model predictions and targets for each training batch. However, the model is not using the current batch for prediction, but the entire training data.

How can I hand over only the current training batch to the Callback?

And how can I access the batches and targets that the Callback saves in self.predhis and self.targets?

My current version looks as follows:

    callback_list = [prediction_history((self.x_train, self.y_train))]

    self.model.fit(self.x_train, self.y_train,
                   batch_size=self.batch_size,
                   epochs=self.n_epochs,
                   validation_data=(self.x_val, self.y_val),
                   callbacks=callback_list)

    class prediction_history(keras.callbacks.Callback):
        def __init__(self, train_data):
            self.train_data = train_data
            self.predhis = []
            self.targets = []

        def on_batch_end(self, epoch, logs={}):
            x_train, y_train = self.train_data
            self.targets.append(y_train)
            prediction = self.model.predict(x_train)
            self.predhis.append(prediction)
            tf.logging.info("Prediction shape: {}".format(prediction.shape))
            tf.logging.info("Targets shape: {}".format(y_train.shape))
Lemon asked Nov 02 '17 15:11




1 Answer

NOTE: this answer is outdated and only works with TF1. Check @bers's answer for a solution tested on TF2.


After model compilation, the placeholder tensor for y_true is in model.targets and y_pred is in model.outputs.

To save the values of these placeholders at each batch, you can:

  1. First copy the values of these tensors into variables.
  2. Evaluate these variables in on_batch_end, and store the resulting arrays.

Step 1 is a bit involved, because you have to add a tf.assign op to the training function model.train_function. With the current Keras API, this can be done by providing a fetches argument to K.function() when the training function is constructed.

In model._make_train_function(), there's a line:

    self.train_function = K.function(inputs,
                                     [self.total_loss] + self.metrics_tensors,
                                     updates=updates,
                                     name='train_function',
                                     **self._function_kwargs)

The fetches argument containing the tf.assign ops can be provided via model._function_kwargs (only works after Keras 2.1.0).

As an example:

    from keras.layers import Dense
    from keras.models import Sequential
    from keras.callbacks import Callback
    from keras import backend as K
    import tensorflow as tf
    import numpy as np

    class CollectOutputAndTarget(Callback):
        def __init__(self):
            super(CollectOutputAndTarget, self).__init__()
            self.targets = []  # collect y_true batches
            self.outputs = []  # collect y_pred batches

            # the shape of these 2 variables will change according to batch shape
            # to handle the "last batch", specify `validate_shape=False`
            self.var_y_true = tf.Variable(0., validate_shape=False)
            self.var_y_pred = tf.Variable(0., validate_shape=False)

        def on_batch_end(self, batch, logs=None):
            # evaluate the variables and save them into lists
            self.targets.append(K.eval(self.var_y_true))
            self.outputs.append(K.eval(self.var_y_pred))

    # build a simple model
    # have to compile first for model.targets and model.outputs to be prepared
    model = Sequential([Dense(5, input_shape=(10,))])
    model.compile(loss='mse', optimizer='adam')

    # initialize the variables and the `tf.assign` ops
    cbk = CollectOutputAndTarget()
    fetches = [tf.assign(cbk.var_y_true, model.targets[0], validate_shape=False),
               tf.assign(cbk.var_y_pred, model.outputs[0], validate_shape=False)]
    model._function_kwargs = {'fetches': fetches}  # use `model._function_kwargs` if using `Model` instead of `Sequential`

    # fit the model and check results
    X = np.random.rand(10, 10)
    Y = np.random.rand(10, 5)
    model.fit(X, Y, batch_size=8, callbacks=[cbk])

Unless the number of samples is divisible by the batch size, the final batch will have a different size from the other batches. The variables therefore have to accept values whose shape changes between batches, so K.variable() and K.update() can't be used in this case. You'll have to use tf.Variable(..., validate_shape=False) and tf.assign(..., validate_shape=False) instead.
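To see which batch sizes the variables have to accommodate, the split can be sketched in plain Python. Note that `batch_sizes` is a hypothetical helper written only for illustration, not a Keras function, and it assumes the samples are cut into consecutive chunks of at most `batch_size` rows:

```python
def batch_sizes(num_samples, batch_size):
    """Return the size of each batch when num_samples rows are split in order.

    Hypothetical helper for illustration only; it is not part of Keras.
    """
    sizes = []
    for start in range(0, num_samples, batch_size):
        # The last chunk is cut short when batch_size does not divide num_samples.
        end = min(start + batch_size, num_samples)
        sizes.append(end - start)
    return sizes


print(batch_sizes(10, 8))  # [8, 2]
print(batch_sizes(16, 8))  # [8, 8]
```

With 10 samples and batch_size=8, as in the example above, the second batch has only 2 rows, which is why a variable with a fixed shape would not work.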


To verify the correctness of the saved arrays, you can add one line in training.py to print out the shuffled index array:

    if shuffle == 'batch':
        index_array = _batch_shuffle(index_array, batch_size)
    elif shuffle:
        np.random.shuffle(index_array)

    print('Index array:', repr(index_array))  # Add this line

    batches = _make_batches(num_train_samples, batch_size)

The shuffled index array should be printed out during fitting:

    Epoch 1/1
    Index array: array([8, 9, 3, 5, 4, 7, 1, 0, 6, 2])
    10/10 [==============================] - 0s 23ms/step - loss: 0.5670

And you can check if cbk.targets is the same as Y[index_array]:

    index_array = np.array([8, 9, 3, 5, 4, 7, 1, 0, 6, 2])
    print(Y[index_array])
    [[ 0.75325592  0.64857277  0.1926653   0.7642865   0.38901153]
     [ 0.77567689  0.13573623  0.4902501   0.42897559  0.55825652]
     [ 0.33760938  0.68195038  0.12303088  0.83509441  0.20991668]
     [ 0.98367778  0.61325065  0.28973401  0.28734073  0.93399794]
     [ 0.26097574  0.88219054  0.87951941  0.64887846  0.41996446]
     [ 0.97794604  0.91307569  0.93816428  0.2125808   0.94381495]
     [ 0.74813435  0.08036688  0.38094272  0.83178364  0.16713736]
     [ 0.52609421  0.39218962  0.21022047  0.58569125  0.08012982]
     [ 0.61276627  0.20679494  0.24124858  0.01262245  0.0994412 ]
     [ 0.6026137   0.25620512  0.7398164   0.52558182  0.09955769]]

    print(cbk.targets)
    [array([[ 0.7532559 ,  0.64857274,  0.19266529,  0.76428652,  0.38901153],
            [ 0.77567691,  0.13573623,  0.49025011,  0.42897558,  0.55825651],
            [ 0.33760938,  0.68195039,  0.12303089,  0.83509439,  0.20991668],
            [ 0.9836778 ,  0.61325067,  0.28973401,  0.28734073,  0.93399793],
            [ 0.26097575,  0.88219053,  0.8795194 ,  0.64887846,  0.41996446],
            [ 0.97794604,  0.91307569,  0.93816429,  0.2125808 ,  0.94381493],
            [ 0.74813437,  0.08036689,  0.38094273,  0.83178365,  0.16713737],
            [ 0.5260942 ,  0.39218962,  0.21022047,  0.58569127,  0.08012982]], dtype=float32),
     array([[ 0.61276627,  0.20679495,  0.24124858,  0.01262245,  0.0994412 ],
            [ 0.60261369,  0.25620511,  0.73981643,  0.52558184,  0.09955769]], dtype=float32)]

As you can see, there are two batches in cbk.targets (one "full batch" of size 8 and the final batch of size 2), and the row order is the same as Y[index_array].
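Instead of comparing the printed arrays by eye, the same check can be done programmatically. The following is only a sketch: `matches_shuffled_targets` is a hypothetical helper, and the `collected` list here is stand-in data that simulates what the callback would store for batch_size=8 (in a real run you would pass `cbk.targets` and the index array printed during fitting):

```python
import numpy as np


def matches_shuffled_targets(collected, Y, index_array):
    """Check that the per-batch arrays, stacked in order, equal Y[index_array].

    Hypothetical helper for illustration only.
    """
    stacked = np.concatenate(collected, axis=0)
    expected = Y[index_array]
    # The collected arrays are float32 while Y is float64,
    # so compare with a tolerance rather than exact equality.
    return stacked.shape == expected.shape and np.allclose(stacked, expected, atol=1e-6)


# Stand-in data: simulate what the callback would collect for batch_size=8.
rng = np.random.RandomState(0)
Y = rng.rand(10, 5)
index_array = np.array([8, 9, 3, 5, 4, 7, 1, 0, 6, 2])
shuffled = Y[index_array].astype(np.float32)
collected = [shuffled[:8], shuffled[8:]]

print(matches_shuffled_targets(collected, Y, index_array))  # True
```

The tolerance is needed because the values fetched from the graph are float32, which is also why the last digits differ slightly in the printed output above.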

Yu-Yang answered Nov 10 '22 12:11