I recently went through this tutorial. I have the trained model from the tutorial and I want to serve it with docker so I can send an arbitrary string of characters to it and get the prediction back from the model.
I also went through this tutorial to understand how to serve with docker, but I didn't understand how the model was saved so that it could accept input parameters. For example:
curl -d '{"instances": [1.0, 2.0, 5.0]}' \
-X POST http://localhost:8501/v1/models/half_plus_two:predict
How does the half_plus_two model know what to do with the instances param?
In the text generation tutorial, there is a method called generate_text that handles generating predictions.
def generate_text(model, start_string):
    # Evaluation step (generating text using the learned model)

    # Number of characters to generate
    num_generate = 1000

    # Converting our start string to numbers (vectorizing)
    input_eval = [char2idx[s] for s in start_string]
    input_eval = tf.expand_dims(input_eval, 0)

    # Empty string to store our results
    text_generated = []

    # Low temperatures results in more predictable text.
    # Higher temperatures results in more surprising text.
    # Experiment to find the best setting.
    temperature = 1.0

    # Here batch size == 1
    model.reset_states()
    for i in range(num_generate):
        predictions = model(input_eval)
        # remove the batch dimension
        predictions = tf.squeeze(predictions, 0)

        # using a multinomial distribution to predict the word returned by the model
        predictions = predictions / temperature
        predicted_id = tf.multinomial(predictions, num_samples=1)[-1, 0].numpy()

        # We pass the predicted word as the next input to the model
        # along with the previous hidden state
        input_eval = tf.expand_dims([predicted_id], 0)

        text_generated.append(idx2char[predicted_id])

    return (start_string + ''.join(text_generated))
How can I serve the trained model from the text generation tutorial and have input parameters to the model API mapped to unique methods such as generate_text? For example:
curl -d '{"start_string": "ROMEO: "}' \
-X POST http://localhost:8501/v1/models/text_generation:predict
To save weights manually, use tf.keras.Model.save_weights. By default, tf.keras (and the Model.save_weights method in particular) uses the TensorFlow Checkpoint format with a .ckpt extension. To save in the HDF5 format with a .h5 extension, refer to the Save and load models guide.
A SignatureDef defines the signature of a computation supported in a TensorFlow graph. SignatureDefs aim to provide generic support to identify inputs and outputs of a function and can be specified when building a SavedModel.
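For illustration, here is a minimal sketch of what "specifying a signature when building a SavedModel" can look like in TF 2.x. This is not the official half_plus_two export script; the module, the export path and the input/output names x and y are my own, but the end result is the same kind of artifact: a SavedModel with a single serving signature that has exactly one input.

import tensorflow as tf

class HalfPlusTwo(tf.Module):
    """Toy model computing y = 0.5 * x + 2, exported with one serving signature."""

    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32, name="x")])
    def predict(self, x):
        # The keys of the returned dict become the "outputs" keys of the SignatureDef.
        return {"y": 0.5 * x + 2.0}

module = HalfPlusTwo()
tf.saved_model.save(
    module,
    "export/half_plus_two/1",  # numeric version subdirectory expected by TF Serving
    signatures={"serving_default": module.predict})

TF Serving can then be pointed at export/half_plus_two and will expose the serving_default signature through its predict endpoint.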
Note: Answering this completely and extensively would require going in depth on the Serving architecture, its APIs and how they interact with models' signatures. I'll skip over all of this to keep the answer to an acceptable length, but I can always expand on excessively obscure parts if necessary (leave a comment if that's the case).
How does the half_plus_two model know what to do with the instances param?
Because of several unmentioned conventions that add up to make this a conveniently short, if (in my opinion) slightly misleading, example.
1) Where does the instances parameter come from? The Predict API definition for the RESTful API has a predefined request format that, in one of its two possible forms, takes one instances parameter.
2) What does the instances parameter map to? We don't know in advance: for SignatureDefs with just one input, instances in that very specific calling format maps directly to that input, with no need to specify the input's key (see the section "Specifying input tensors in row format" in the API specs).
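As a client-side sketch (the input key x is an assumption about the served signature, and the exact numbers of course depend on the model), the same values can be sent either in row format or by naming the input explicitly in the columnar format:

import json
import requests

URL = "http://localhost:8501/v1/models/half_plus_two:predict"

# Row format: the "instances" list is matched to the signature's single input.
row_request = {"instances": [1.0, 2.0, 5.0]}

# Columnar format: inputs are addressed by their key in the SignatureDef
# ("x" is an assumed input name here).
columnar_request = {"inputs": {"x": [1.0, 2.0, 5.0]}}

print(requests.post(URL, data=json.dumps(row_request)).json())
# e.g. {"predictions": [2.5, 3.0, 4.5]}
print(requests.post(URL, data=json.dumps(columnar_request)).json())
# e.g. {"outputs": [2.5, 3.0, 4.5]}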
So, what happens is: you make a POST request to a model with just one input defined. TF Serving takes that input, feeds it to the model, and runs it until it has all the values for the tensors defined in the "outputs" part of the model's signature; it then returns a JSON object with one key:result item for each key in the "outputs" list.
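If you want to check what a running model's signature actually defines, TF Serving also exposes a metadata endpoint that returns the SignatureDefs (a small sketch, using the model name from the example above):

import requests

# Returns the model spec together with its signature_def map
# (inputs, outputs and method name for each signature).
resp = requests.get("http://localhost:8501/v1/models/half_plus_two/metadata")
print(resp.json())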
How can I serve the trained model from the text generation tutorial and have input parameters to the model api mapped to unique methods such as generate_text?
You can't (at least not by directly mapping a function to a Serving method). The Serving infrastructure exposes some predefined methods (regress, predict, classify) that know how to interpret the signatures to produce the output you requested by running specific subgraphs of the model. These subgraphs must be included in the SavedModel, so, for example, using tf.py_func won't work.
Your best bet is to try to describe text generation as a TF subgraph (i.e. using exclusively TF operations) and to write a separate SignatureDef that takes the start string and num_generate as inputs.
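To give a rough idea of what that could look like, here is a heavily simplified, untested sketch in TF 2.x style. It assumes model is the trained generation model from the tutorial (built with batch size 1) and vocab is the sorted list of characters behind char2idx and idx2char; it swaps tf.multinomial for tf.random.categorical, primes the network with the start string before the loop so the traced loop works on fixed shapes, and, among other simplifications, does not reset the RNN state between requests.

import tensorflow as tf

class TextGenerator(tf.Module):
    """Wraps the trained model plus the character lookup tables so that
    everything the signature needs is tracked and ends up in the SavedModel."""

    def __init__(self, model, vocab):
        super().__init__()
        self.model = model
        self.idx2char = tf.constant(vocab)  # id -> character
        self.char2idx = tf.lookup.StaticHashTable(
            tf.lookup.KeyValueTensorInitializer(
                keys=tf.constant(vocab),
                values=tf.range(len(vocab), dtype=tf.int64)),
            default_value=0)

    @tf.function(input_signature=[tf.TensorSpec([], tf.string),
                                  tf.TensorSpec([], tf.int32)])
    def generate_text(self, start_string, num_generate):
        # Vectorize the start string using TF ops only.
        chars = tf.strings.unicode_split(start_string, "UTF-8")
        input_eval = tf.expand_dims(self.char2idx.lookup(chars), 0)

        # Prime the RNN state with the whole start string; keep the last logits.
        logits = self.model(input_eval)[:, -1, :]
        predicted_id = tf.random.categorical(logits, num_samples=1)[0, 0]

        generated = tf.TensorArray(tf.string, size=num_generate)
        for i in tf.range(num_generate):
            generated = generated.write(
                i, tf.gather(self.idx2char, predicted_id))
            # Feed the previous prediction back in, one character at a time.
            logits = self.model(tf.reshape(predicted_id, [1, 1]))[:, -1, :]
            predicted_id = tf.random.categorical(logits, num_samples=1)[0, 0]

        text = tf.strings.join(
            [start_string, tf.strings.reduce_join(generated.stack())])
        return {"text": text}

generator = TextGenerator(model, vocab)
tf.saved_model.save(
    generator, "export/text_generation/1",
    signatures={"serving_default": generator.generate_text})

The signature's input keys default to the argument names (start_string, num_generate), so a request in the named row or columnar format could, in principle, address them much like the curl example from the question.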