When saving a checkpoint, TensorFlow often saves a meta file: my_model.ckpt.meta. What is in that file? Can we still restore a model if we delete it, and what information do we lose if we restore a model without the meta file?
The ModelCheckpoint callback is used together with training via model.fit() to save a model or its weights (in a checkpoint file) at some interval, so that the model or weights can be loaded later to continue training from the saved state.
This file contains a serialized MetaGraphDef protocol buffer. The MetaGraphDef is designed as a serialization format that includes all of the information required to restore a training or inference process (including the GraphDef that describes the dataflow, and additional annotations that describe the variables, input pipelines, and other relevant information). For example, the MetaGraphDef is used by TensorFlow Serving to start an inference service based on your trained model. We are investigating other tools that could use the MetaGraphDef for training.
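As a rough illustration of what the .meta file carries, the sketch below rebuilds a graph from the .meta file alone, without calling any model-building code. It assumes the TF1-style API exposed as tf.compat.v1; the variable name "w", the values, and the temporary directory are made up for the example.

```python
import os
import tempfile

import tensorflow as tf

tf1 = tf.compat.v1

ckpt_prefix = os.path.join(tempfile.mkdtemp(), "my_model.ckpt")

# Write a checkpoint; Saver.save() also writes my_model.ckpt.meta by default.
with tf.Graph().as_default():
    # Hypothetical one-variable "model" used only for this sketch.
    w = tf1.get_variable("w", shape=[2], initializer=tf1.zeros_initializer())
    saver = tf1.train.Saver()
    with tf1.Session() as sess:
        sess.run(tf1.global_variables_initializer())
        sess.run(w.assign([3.0, 4.0]))
        saver.save(sess, ckpt_prefix)

# Recreate the graph from the MetaGraphDef only: no model code is called here.
with tf.Graph().as_default() as graph:
    saver = tf1.train.import_meta_graph(ckpt_prefix + ".meta")
    op_names = [op.name for op in graph.get_operations()]
    with tf1.Session() as sess:
        saver.restore(sess, ckpt_prefix)
        # The variable collection is recovered from the MetaGraphDef annotations.
        restored = sess.run(tf1.global_variables()[0]).tolist()

print("w" in op_names, restored)
```

Here the graph structure and the variable collection come from the .meta file, while the values come from the checkpoint data files.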
Assuming that you still have the Python code for your model, you do not need the MetaGraphDef to restore the model, because you can reconstruct all of the information in the MetaGraphDef by re-executing the Python code that builds the model. To restore from a checkpoint, you only need the checkpoint files that contain the trained weights, which are written periodically to the same directory.
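A minimal sketch of that workflow, assuming the TF1-style API exposed as tf.compat.v1: the build_model function, the variable name "w", and the temporary directory are hypothetical stand-ins for your own model code and paths. The .meta file is deleted on purpose, and the weights are still restored after the graph is rebuilt in Python.

```python
import os
import tempfile

import tensorflow as tf

tf1 = tf.compat.v1


def build_model():
    """Hypothetical model-building code: a single variable named 'w'."""
    return tf1.get_variable("w", shape=[2], initializer=tf1.zeros_initializer())


ckpt_prefix = os.path.join(tempfile.mkdtemp(), "my_model.ckpt")

# 1. Build the graph, set the variable to a known value, and save a checkpoint.
with tf.Graph().as_default():
    w = build_model()
    saver = tf1.train.Saver()
    with tf1.Session() as sess:
        sess.run(tf1.global_variables_initializer())
        sess.run(w.assign([1.0, 2.0]))
        saver.save(sess, ckpt_prefix)

# 2. Delete the meta file; the weight files (.index, .data-*) remain on disk.
meta_path = ckpt_prefix + ".meta"
os.remove(meta_path)

# 3. Rebuild the same graph from Python code and restore the weights.
with tf.Graph().as_default():
    w = build_model()
    saver = tf1.train.Saver()
    with tf1.Session() as sess:
        saver.restore(sess, ckpt_prefix)
        restored = sess.run(w).tolist()

print(restored)
```

The restore in step 3 only works because build_model recreates variables with the same names and shapes as those stored in the checkpoint. Without either the model code or the .meta file, there would be no graph to load the weights into.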