I want to use TensorFlow Serving for a custom model (no pre-trained starting point).
I've made it through the pre-Kubernetes part of the TensorFlow Serving tutorial for Inception, using Docker: http://tensorflow.github.io/serving/serving_inception
I understand (roughly) that the Bazel compilation is central to how everything works, but I am trying to understand how the generated predict_pb2 from tensorflow_serving.apis works, so that I can swap in my own custom model.
To be clear, this is what main() in inception_client.py currently looks like:
def main(_):
  host, port = FLAGS.server.split(':')
  channel = implementations.insecure_channel(host, int(port))
  stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)
  # Send request
  with open(FLAGS.image, 'rb') as f:
    # See prediction_service.proto for gRPC request/response details.
    data = f.read()
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'inception'
    request.model_spec.signature_name = 'predict_images'
    request.inputs['images'].CopyFrom(
        tf.contrib.util.make_tensor_proto(data, shape=[1]))
    result = stub.Predict(request, 10.0)  # 10 secs timeout
    print(result)
https://github.com/tensorflow/serving/blob/65f50621a192004ab5ae68e75818e94930a6778b/tensorflow_serving/example/inception_client.py#L38-L52
It's hard for me to unpack and debug what predict_pb2.PredictRequest() is doing, since it's Bazel-generated. But I would like to re-point this client at a totally different saved model, with its own .pb file, etc.
How can I refer to a different saved model?
PredictionService, defined here, is the gRPC service definition: it declares which RPC functions the server will respond to. From this proto, Bazel/protoc generates code that is linked into both the server and the client (the predict_pb2 module you mentioned).
The server extends the autogenerated service here and provides an implementation for each function.
Python clients use the generated predict_pb2 to build a request and send it over the appropriate RPC.
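Stripped of the Inception specifics, a minimal client reduces to just these pieces (a sketch assuming the tutorial-era beta gRPC API; localhost:9000 is a placeholder address):

from grpc.beta import implementations
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2

# prediction_service_pb2 provides the service stub (the RPC methods);
# predict_pb2 provides the request/response message types.
channel = implementations.insecure_channel('localhost', 9000)  # placeholder address
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

request = predict_pb2.PredictRequest()  # build the message here...
# ...fill in model_spec and inputs, then send the RPC:
# result = stub.Predict(request, 10.0)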
predict_pb2.PredictRequest() is a PredictRequest proto, defined here, which is the request type for the Predict() API call (see the PredictionService proto definition linked above). That part of the code simply builds a request; result = stub.Predict(request, 10.0) is where the request is actually sent.
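Because the generated code is ordinary protobuf Python, you can also inspect a request interactively instead of reading the Bazel output. A small sketch (the field names come from predict.proto and model.proto):

from tensorflow_serving.apis import predict_pb2

request = predict_pb2.PredictRequest()
request.model_spec.name = 'inception'
request.model_spec.signature_name = 'predict_images'

# Generated messages are plain Python objects with protobuf helpers:
print(request)                                  # text-format dump of what will be sent
print(list(request.DESCRIPTOR.fields_by_name))  # top-level field names of the message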
In order to use a different model, you just need to change the ModelSpec's model name to your model's name. In the example above, the server loaded the Inception model under the name "inception", so the client queries it with request.model_spec.name = 'inception'. To use your model instead, change that name to your model's name. Note that you'll probably also need to change the signature_name to your custom signature name, or remove it entirely to use the default signature (assuming one is defined).
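Putting that together, a client for your own model might look like the sketch below. This is hypothetical: 'my_model', the 'input' tensor key, and the signature name are placeholders that must match what your server loaded (its --model_name flag) and the signature you exported with your SavedModel.

import tensorflow as tf
from grpc.beta import implementations
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2

host, port = 'localhost', 9000  # placeholder address
channel = implementations.insecure_channel(host, port)
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = 'my_model'  # must match --model_name on the server
# Set this to the signature you exported, or delete the line to use
# the default signature (if one is defined):
request.model_spec.signature_name = 'serving_default'
# The key 'input' and the tensor shape must match your exported signature:
request.inputs['input'].CopyFrom(
    tf.contrib.util.make_tensor_proto([[1.0, 2.0, 3.0]], shape=[1, 3]))

result = stub.Predict(request, 10.0)  # 10 second timeout
print(result)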