Suppose you have a Keras model with an optimizer like Adam that you save via save_model.
If you load the model again with load_model, does it really load ALL optimizer parameters + weights?
Based on the code of save_model (Link), Keras saves the config of the optimizer:
    f.attrs['training_config'] = json.dumps({
        'optimizer_config': {
            'class_name': model.optimizer.__class__.__name__,
            'config': model.optimizer.get_config()},
which, in the case of Adam for example (Link), is as follows:
    def get_config(self):
        config = {'lr': float(K.get_value(self.lr)),
                  'beta_1': float(K.get_value(self.beta_1)),
                  'beta_2': float(K.get_value(self.beta_2)),
                  'decay': float(K.get_value(self.decay)),
                  'epsilon': self.epsilon}
As such, this only saves the fundamental parameters but no per-variable optimizer weights.
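The difference between the optimizer config and the per-variable optimizer weights can be sketched without Keras. The class below is a toy, scalar-only Adam written purely for illustration. `TinyAdam` and `step` are hypothetical names, and `get_config` / `get_weights` are only modeled on the Keras method names; this is not the Keras implementation.

```python
import math


class TinyAdam:
    """Toy scalar Adam that separates hyperparameters from per-variable state."""

    def __init__(self, lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-7):
        self.lr = lr
        self.beta_1 = beta_1
        self.beta_2 = beta_2
        self.epsilon = epsilon
        self.iterations = 0  # global step counter
        self.m = {}          # first-moment estimate, one entry per variable
        self.v = {}          # second-moment estimate, one entry per variable

    def get_config(self):
        # Hyperparameters only: this never changes while training.
        return {"lr": self.lr, "beta_1": self.beta_1,
                "beta_2": self.beta_2, "epsilon": self.epsilon}

    def get_weights(self):
        # The state that training actually modifies.
        return {"iterations": self.iterations,
                "m": dict(self.m), "v": dict(self.v)}

    def step(self, params, grads):
        """Apply one Adam update in place to the dict `params`."""
        self.iterations += 1
        t = self.iterations
        for name, g in grads.items():
            self.m[name] = self.beta_1 * self.m.get(name, 0.0) + (1 - self.beta_1) * g
            self.v[name] = self.beta_2 * self.v.get(name, 0.0) + (1 - self.beta_2) * g * g
            m_hat = self.m[name] / (1 - self.beta_1 ** t)  # bias correction
            v_hat = self.v[name] / (1 - self.beta_2 ** t)
            params[name] -= self.lr * m_hat / (math.sqrt(v_hat) + self.epsilon)


opt = TinyAdam()
params = {"w": 1.0, "b": -2.0}
config_before = opt.get_config()
opt.step(params, {"w": 0.5, "b": -0.25})
config_after = opt.get_config()
state_after = opt.get_weights()

print(config_before == config_after)  # True: the config is unaffected by training
print(sorted(state_after["m"]))       # ['b', 'w']: one moment entry per variable
```

Restoring only the config would rebuild the optimizer with the right hyperparameters but with empty `m`, `v` and a zero step counter, which is exactly the concern raised above.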
However, after dumping the config in save_model, it looks like some optimizer weights are saved as well (Link). Unfortunately, I can't tell whether every weight of the optimizer is saved.
So if you want to continue training the model in a new session with load_model, is the state of the optimizer really 100% the same as in the last training session? E.g. in the case of SGD with momentum, does it save all per-variable momentums?
Or in general, does it make a difference in training if you stop and resume training with save/load_model?
It seems your links no longer point to the lines they pointed to when you asked the question, so I don't know which lines you are referring to.
But the answer is yes, the entire state of the optimizer is saved along with the model. You can see this happening in save_model(). If you do not want to save the optimizer weights, pass include_optimizer=False to save_model().
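To see why the per-variable state matters, here is a framework-free sketch of SGD with momentum on the toy loss f(w) = 0.5 * w**2. It does not use Keras; `sgd_momentum_steps` is a hypothetical helper, and "resuming with a fresh velocity" only stands in for loading a model that was saved without its optimizer weights.

```python
def sgd_momentum_steps(w, velocity, steps, lr=0.1, momentum=0.9):
    """Run `steps` updates of SGD with momentum on f(w) = 0.5 * w**2."""
    for _ in range(steps):
        grad = w  # derivative of 0.5 * w**2
        velocity = momentum * velocity - lr * grad
        w = w + velocity
    return w, velocity


# Reference run: 20 uninterrupted steps.
w_full, _ = sgd_momentum_steps(1.0, 0.0, steps=20)

# Interrupted run: stop after 10 steps and "checkpoint" weight and velocity.
w_ckpt, v_ckpt = sgd_momentum_steps(1.0, 0.0, steps=10)

# Resume with the saved velocity (optimizer state restored).
w_resumed, _ = sgd_momentum_steps(w_ckpt, v_ckpt, steps=10)

# Resume with a zero velocity (optimizer state discarded).
w_reset, _ = sgd_momentum_steps(w_ckpt, 0.0, steps=10)

print(w_full == w_resumed)  # True: same trajectory as the uninterrupted run
print(w_full == w_reset)    # False: the lost momentum changes the result
```

The weights at the checkpoint are identical in both resumed runs; only the discarded velocity makes the second one diverge from the uninterrupted run.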
If you inspect the resulting *.h5 file, for example by means of h5dump | less, you can see those weights. (h5dump ships with the HDF5 command-line tools.)
Therefore saving a model and loading it again later should make no difference in many common cases. However, there are exceptions that are not related to the optimizer. One that comes to mind is an LSTM(stateful=True) layer, which I believe does not save its internal LSTM states when you call save_model(). There are possibly many more reasons why interrupting training with save/load might not produce exactly the same results as training without interruption, but investigating them probably only makes sense in the context of concrete code.