Show Model Validation Progress with Keras model.fit()

Question

I am training a CNN model with tf.keras, passing training and validation generators as follows:

model.fit(
    x=training_data_generator,
    validation_data=validation_data_generator,
    epochs=n_epochs,
    use_multiprocessing=False,
    max_queue_size=100,
    workers=50
)

The generators are based on tf.keras.utils.Sequence.

The problem is, my data set is huge. Training one epoch takes about a day (despite training on two Titan RTX GPUs) and validation after each epoch takes a few hours.

During training I can see the progress displayed, but during validation all I see is the last snapshot of the training progress bar:

130339/130340 [==============================] - 147432s 1s/step

until validation finishes and I finally see my validation accuracy, loss, etc.

Is there a way to display a progress bar for validation?

I'm thinking of doing something like this:

for epoch in range(n_epochs):
    model.fit(
        x=training_data_generator,
        epochs=1,
        use_multiprocessing=False,
        max_queue_size=100,
        workers=50
    )
    validation_results = model.evaluate(
        x=validation_data_generator,
        use_multiprocessing=False,
        max_queue_size=100,
        workers=50
    )
    print(validation_results)

Another option I was considering is to create a custom callback that validates the model on_epoch_end, but this seems very non-standard.

Is there a better approach to this?

TF_Support · Accepted Answer

You can set steps_per_epoch in the call to fit. From the documentation:

Total number of steps (batches of samples) before declaring one epoch finished and starting the next epoch. When training with input tensors such as TensorFlow data tensors, the default None is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined. If x is a tf.data dataset, and 'steps_per_epoch' is None, the epoch will run until the input dataset is exhausted. This argument is not supported with array inputs.

This lets you limit the number of steps per epoch. With a lower value, each epoch ends sooner, so the validation loss and accuracy are reported sooner.
Because each epoch then covers less data, you need to increase the number of epochs to compensate.
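As a rough illustration of how many epochs that means, the sketch below does the arithmetic for the figures in the question. The 130340 steps per full pass come from the progress bar shown there; the value of 1000 for steps_per_epoch and the helper name epochs_for_full_passes are assumptions made for this example only.

```python
def epochs_for_full_passes(steps_per_pass, steps_per_epoch, n_passes=1):
    """Return how many shortened epochs are needed to run at least
    n_passes * steps_per_pass training steps (ceiling division)."""
    total_steps = steps_per_pass * n_passes
    return (total_steps + steps_per_epoch - 1) // steps_per_epoch


full_pass_steps = 130340   # batches per full pass, from the question's progress bar
short_epoch_steps = 1000   # assumed steps_per_epoch for illustration

# Number of 1000-step epochs that add up to one full pass over the data.
print(epochs_for_full_passes(full_pass_steps, short_epoch_steps))  # 131
```

Validation then runs after every shortened epoch, so with a validation set that takes hours to evaluate it may be worth capping each run with the validation_steps argument of fit, or spacing runs out with validation_freq. Whether successive short epochs actually cover different parts of the data depends on how the input pipeline orders and shuffles batches.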

For example, with steps_per_epoch=1000, the training and validation loss and accuracy are reported after every 1000 steps, rather than only after the entire dataset has been exhausted.

history = model.fit(x_train, y_train,
                    batch_size=2,
                    epochs=30,
                    # End each epoch after 1000 batches instead of
                    # after a full pass over the training data
                    steps_per_epoch=1000,
                    # Validation data, used to report validation loss
                    # and metrics at the end of each shortened epoch
                    validation_data=(x_val, y_val))
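The callback idea raised in the question can also be used to print progress during validation itself, without changing the epoch length. The following is only a minimal sketch: the class name ValidationProgress and its total_steps and every parameters are hypothetical, while on_test_batch_end is the Keras callback hook that runs after each evaluation batch, including validation batches inside fit.

```python
import tensorflow as tf


class ValidationProgress(tf.keras.callbacks.Callback):
    """Print a progress line every `every` validation batches.

    Hypothetical helper for illustration; `total_steps` is optional and
    only used to show the total and to report the final batch.
    """

    def __init__(self, total_steps=None, every=100):
        super().__init__()
        self.total_steps = total_steps
        self.every = every

    def on_test_batch_end(self, batch, logs=None):
        # `batch` is zero-based, so convert it to a one-based step count.
        step = batch + 1
        if step % self.every == 0 or step == self.total_steps:
            total = self.total_steps if self.total_steps is not None else "?"
            print(f"validation step {step}/{total}")


# Possible usage with the generators from the question (not executed here):
# model.fit(
#     x=training_data_generator,
#     validation_data=validation_data_generator,
#     epochs=n_epochs,
#     callbacks=[ValidationProgress(
#         total_steps=len(validation_data_generator), every=100)],
# )
```

Because the hook fires once per validation batch, a larger value of every keeps the log short for a validation run that spans many thousands of batches.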

Tags: python, tensorflow, keras

aL_eX
