I am loading a model in Keras with `model.load()` and am finding that the first prediction takes more than 10x longer than subsequent predictions. Any ideas why this could be occurring, or suggestions to speed up the load-initialise-first-prediction cycle, would be greatly appreciated.
I am using the TensorFlow backend with CPU processing.
Thanks for the help, Denym
`predict` passes the input through the model and returns the output tensor for each datapoint. Since the last layer in your model is a single Dense neuron, the output for any datapoint is a single value. And since you didn't specify an activation for the last layer, it defaults to linear activation.
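A minimal sketch of that behaviour (the single-neuron model here is illustrative, not the asker's actual model):

```python
import numpy as np
from tensorflow import keras

# Illustrative model: the last layer is a single Dense neuron
# with no activation specified, so it defaults to linear.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1),  # linear activation by default
])

x = np.random.rand(3, 4).astype("float32")
out = model.predict(x)
print(out.shape)  # one scalar output per datapoint: (3, 1)
```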
The model config, weights, and optimizer are saved in the SavedModel. Additionally, for every Keras layer attached to the model, the SavedModel stores:
* the config and metadata, e.g. name, dtype, trainable status
* traced call and loss functions, which are stored as TensorFlow subgraphs
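The per-layer config and metadata mentioned above is the same kind of information a layer's `get_config()` exposes, as this small sketch shows (the layer name `"head"` is just an example):

```python
from tensorflow import keras

# For each layer, the SavedModel records the config and metadata
# that get_config() exposes: name, trainable status, dtype, etc.
layer = keras.layers.Dense(2, name="head")
cfg = layer.get_config()
print(cfg["name"], cfg["trainable"])
```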
OK, so I have found the answer that works for me:
If you are loading many models simultaneously, don't use the Keras `model.load` function. Save your structure as JSON/YAML and the weights as a .h5 file, and load them as per the Keras examples.
The `model.load` function is much quicker when dealing with fewer than 5 models; however, load times increase exponentially the more models you load simultaneously.
Loading the structure from JSON and the weights from .h5 was 10x faster when loading 100 models simultaneously. While there is still some per-model slowdown with the structure-and-weights method, it is linear rather than exponential, making it significantly faster when loading many models at once.
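The split save/load described above can be sketched like this (filenames are illustrative, and a tiny stand-in model is used in place of one of the many real models):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras.models import model_from_json

# A small stand-in for one of the models being loaded.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(1),
])

# Save the structure (JSON) and the weights (.h5) separately.
with open("model.json", "w") as f:
    f.write(model.to_json())
model.save_weights("model.weights.h5")

# Load: rebuild the architecture from JSON, then restore weights.
with open("model.json") as f:
    restored = model_from_json(f.read())
restored.load_weights("model.weights.h5")

# The restored model reproduces the original's predictions.
x = np.ones((2, 4), dtype="float32")
print(np.allclose(model.predict(x), restored.predict(x)))
```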