I have a neural network model coded in Keras.
When I run it on my laptop, I have the following output that shows the progress of the model:
Train on 4 samples, validate on 1 samples
Epoch 1/1
4/4 [==============================] - 22s 5s/step - loss: 0.2477 - val_loss: 0.2672
However, when I submit this code to a cluster for running, I do not know how many epochs are left, so I would like to save the above output into a file while the model is running.
How can I do that?
We can use the Keras callback keras.callbacks.ModelCheckpoint() with save_best_only=True to save the model at its best-performing epoch.
In R and Python, you can save a model locally or to HDFS using the h2o.saveModel (R) or h2o.save_model (Python) function. This function accepts the model object and the file path.
model.save() saves the whole architecture, the weights, and the optimizer state. That is everything needed to reconstitute your model.
You can do it with a for loop and model.save:
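import os

PATH_TO_MODELS = 'path to models directory'
TOTAL_EPOCHS = 8  # number of epochs you want to train and save

for epoch in range(TOTAL_EPOCHS):
    # Train for one more epoch; the weights carry over between fit() calls
    model.fit(..., epochs=1)
    # Name the file after the number of completed epochs (1-based)
    save_name = 'model_%depochs.h5' % (epoch + 1)
    model.save(os.path.join(PATH_TO_MODELS, save_name))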
This will save the output model from each epoch into a distinct file, so that you can keep track of them.
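Since the question is really about the progress output rather than the model itself, the same per-epoch idea can be applied to the metrics. Here is a minimal sketch, not a definitive implementation: `make_epoch_logger`, the log file name, and the JSON-lines format are my own assumptions. It builds a function with the `(epoch, logs)` signature that Keras passes to `on_epoch_end`, and appends one line per finished epoch, including how many epochs are left.

```python
import json
import time


def make_epoch_logger(log_path, total_epochs):
    """Return a function with the (epoch, logs) signature used for on_epoch_end.

    `make_epoch_logger` is a hypothetical helper, not part of Keras.
    """

    def on_epoch_end(epoch, logs=None):
        logs = logs or {}
        record = {
            "time": time.strftime("%Y-%m-%d %H:%M:%S"),
            "epoch": epoch + 1,  # the epoch index passed in is zero-based
            "epochs_left": total_epochs - (epoch + 1),
        }
        # Convert metric values to plain floats so they serialize cleanly.
        record.update({name: float(value) for name, value in logs.items()})
        # Open in append mode and close right away, so each line is written
        # as soon as the epoch finishes and can be followed with `tail -f`.
        with open(log_path, "a") as f:
            f.write(json.dumps(record) + "\n")
        return record

    return on_epoch_end
```

With a single `model.fit(..., epochs=TOTAL_EPOCHS)` call, this could be hooked in as `keras.callbacks.LambdaCallback(on_epoch_end=make_epoch_logger('training_progress.log', TOTAL_EPOCHS))` in the `callbacks` list. In the one-epoch-per-`fit` loop above, Keras would report epoch 0 every time, so there you would call the function yourself with the loop's `epoch`.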
One way is to add the RemoteMonitor callback and view the progress in real time. I haven't tried it yet, but I know it exists and have wanted to.
keras.callbacks.RemoteMonitor(root='http://localhost:9000', path='/publish/epoch/end/', field='data', headers=None, send_as_json=False)
You can find the documentation here.
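RemoteMonitor needs something listening at the other end. Below is a rough sketch of a receiver using only the Python standard library. It assumes, as I understand the callback's behavior with send_as_json=False, that each epoch arrives as a form-encoded POST whose `data` field holds the metrics as a JSON string. The names `decode_epoch_payload`, `EpochEndHandler`, `serve`, and the log file name are hypothetical, and the cluster node must be able to reach the host running it.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs

LOG_PATH = "remote_monitor.log"  # hypothetical output file


def decode_epoch_payload(body, field="data"):
    """Decode a form-encoded POST body whose `field` entry is a JSON string."""
    form = parse_qs(body.decode("utf-8"))
    return json.loads(form[field][0])


class EpochEndHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Accepts a POST on any path; the default callback path is
        # '/publish/epoch/end/'.
        length = int(self.headers.get("Content-Length", 0))
        record = decode_epoch_payload(self.rfile.read(length))
        # Append one JSON line per epoch so the file can be followed live.
        with open(LOG_PATH, "a") as f:
            f.write(json.dumps(record) + "\n")
        self.send_response(200)
        self.end_headers()


def serve(host="localhost", port=9000):
    """Block forever, logging every epoch that gets posted to host:port."""
    HTTPServer((host, port), EpochEndHandler).serve_forever()
```

Calling `serve()` on the machine named in `root` before training starts would then collect one line per epoch in `remote_monitor.log`.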
Another option is to use TensorBoard. Keras also has a callback for that, keras.callbacks.TensorBoard, covered in the documentation I already linked. Here is the information from TensorFlow.