
TensorFlow Serving: When to use it rather than simple inference inside a Flask service?

I am serving a model trained using the object detection API. Here is how I did it:

  • Create a TensorFlow Serving service on port 9000, as described in the basic tutorial

  • Write Python code that calls this service using predict_pb2 from tensorflow_serving.apis, similar to this (a sketch follows this list)

  • Call this code inside a Flask server to make the service available over HTTP
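
For reference, a minimal sketch of such a gRPC client is below; the model name "detector", the input tensor name "inputs", and the dummy image are assumptions for illustration:

    # Sketch of a gRPC client for TensorFlow Serving (names are assumed).
    import grpc
    import numpy as np
    import tensorflow as tf
    from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

    channel = grpc.insecure_channel("localhost:9000")
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

    request = predict_pb2.PredictRequest()
    request.model_spec.name = "detector"                # assumed model name
    request.model_spec.signature_name = "serving_default"
    image = np.zeros((1, 300, 300, 3), dtype=np.uint8)  # stand-in for a real image batch
    request.inputs["inputs"].CopyFrom(tf.make_tensor_proto(image))
    response = stub.Predict(request, 10.0)              # 10-second timeout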

Still, I could have done things much more simply:

  • Write Python inference code, as in the example in the object detection repo (sketched after this list)
  • Call this code inside a Flask server to make the service available over HTTP
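
A minimal sketch of that simpler alternative: load the SavedModel once and run inference in-process. The model path and input shape here are assumptions for illustration:

    import numpy as np
    import tensorflow as tf
    from flask import Flask, jsonify, request

    app = Flask(__name__)
    model = tf.saved_model.load("exported_model/saved_model")  # assumed path
    infer = model.signatures["serving_default"]

    @app.route("/detect", methods=["POST"])
    def detect():
        # Raw bytes in, detections out; a fixed input shape is assumed.
        image = np.frombuffer(request.data, dtype=np.uint8)
        image = image.reshape((1, 300, 300, 3))
        outputs = infer(tf.constant(image))
        return jsonify({k: v.numpy().tolist() for k, v in outputs.items()})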

As you can see, I could have skipped TensorFlow Serving entirely.

So, is there any good reason to use TensorFlow Serving in my case? If not, in which cases should I use it?

asked Jan 30 '18 by Aloïs de La Comble


People also ask

What is the use of TensorFlow Serving?

TensorFlow Serving is a flexible, high-performance serving system for machine learning models, designed for production environments. TensorFlow Serving makes it easy to deploy new algorithms and experiments while keeping the same server architecture and APIs.

What is serving default in TensorFlow?

TensorFlow Serving allows us to select which version of a model, or "servable", we want to use when we make inference requests. Each version will be exported to a different sub-directory under the given path.
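
As a hedged illustration of that convention, exporting two versions under numbered sub-directories might look like this (the base path "models/detector" and the trivial model are assumptions):

    import tensorflow as tf

    # A trivial tf.Module, just to demonstrate the version-directory layout.
    class Doubler(tf.Module):
        @tf.function(input_signature=[tf.TensorSpec([None], tf.float32)])
        def __call__(self, x):
            return {"outputs": x * 2.0}

    # Each numeric sub-directory under the base path is one servable version;
    # TensorFlow Serving serves the highest version number by default.
    tf.saved_model.save(Doubler(), "models/detector/1")
    tf.saved_model.save(Doubler(), "models/detector/2")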

How do you deploy a model using TensorFlow Serving?

Fortunately, TensorFlow was developed for production, and it provides a solution for model deployment: TensorFlow Serving. Basically, there are three steps: export your model for serving, create a Docker container with your model, and deploy it with Kubernetes to a cloud platform, e.g. Google Cloud or Amazon AWS.
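
A rough sketch of the container step using the Docker SDK for Python (pip install docker); the host path and the model name "detector" are assumptions, while tensorflow/serving is the official image:

    import docker

    client = docker.from_env()
    client.containers.run(
        "tensorflow/serving",
        detach=True,
        ports={"8501/tcp": 8501},  # TF Serving's default REST port
        volumes={"/abs/path/models/detector": {"bind": "/models/detector", "mode": "ro"}},
        environment={"MODEL_NAME": "detector"},  # which model to load
    )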


2 Answers

I believe most of the reasons why you would prefer TensorFlow Serving over Flask are related to performance:

  • TensorFlow Serving uses gRPC and Protobuf, while a regular Flask web service uses REST and JSON. REST typically runs over HTTP/1.1, while gRPC uses HTTP/2 (there are important differences). In addition, Protobuf is a binary serialization format that is more efficient than JSON.
  • TensorFlow Serving can batch requests to the same model, which uses hardware (e.g. GPUs) more efficiently.
  • TensorFlow Serving can manage model versioning.

As with almost everything, it depends a lot on your use case and scenario, so it's important to weigh the pros and cons against your requirements. TensorFlow Serving has great features, but with some effort these features could also be implemented to work with Flask (for instance, you could create your own batching mechanism, as sketched below).
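
A hand-rolled micro-batching layer might look roughly like this sketch; all names (BATCH_SIZE, TIMEOUT_S, run_model) are illustrative assumptions, and TensorFlow Serving gives you a tuned version of this for free:

    import queue
    import threading

    import numpy as np

    BATCH_SIZE = 8
    TIMEOUT_S = 0.01          # max wait for a full batch before flushing
    _pending = queue.Queue()  # holds (input_array, done_event, result_holder)

    def run_model(batch):
        # Placeholder for the real model call, e.g. infer(tf.constant(batch)).
        return batch * 2

    def _batch_worker():
        while True:
            items = [_pending.get()]  # block until at least one request arrives
            try:
                while len(items) < BATCH_SIZE:
                    items.append(_pending.get(timeout=TIMEOUT_S))
            except queue.Empty:
                pass  # flush a partial batch after the timeout
            batch = np.stack([inp for inp, _, _ in items])
            for (_, event, holder), out in zip(items, run_model(batch)):
                holder.append(out)
                event.set()

    threading.Thread(target=_batch_worker, daemon=True).start()

    def predict(input_array):
        """Called from a Flask view; blocks until the batched result is ready."""
        event, holder = threading.Event(), []
        _pending.put((input_array, event, holder))
        event.wait()
        return holder[0]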

answered Oct 17 '22 by Thomas Paula


Flask is a general-purpose framework for handling requests and responses, whereas TensorFlow Serving is built specifically for serving flexible ML models in production.

Let's take some scenarios where you want to:

  • Serve multiple models to multiple products (many-to-many relations) at the same time.
  • See which model is making an impact on your product (A/B testing).
  • Update model weights in production, which is as easy as saving a new model to a folder.
  • Get performance comparable to code written in C/C++.

And you can always get all of those advantages for free by sending requests to TF Serving from Flask.
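
For example, a thin Flask proxy in front of TF Serving's REST endpoint could look like this sketch; port 8501 is Serving's default REST port, while the model name "detector" is an assumption:

    import requests
    from flask import Flask, jsonify, request

    app = Flask(__name__)
    TF_SERVING_URL = "http://localhost:8501/v1/models/detector:predict"

    @app.route("/detect", methods=["POST"])
    def detect():
        # TF Serving expects JSON of the form {"instances": [...]}.
        resp = requests.post(TF_SERVING_URL, json=request.get_json(), timeout=10)
        return jsonify(resp.json()), resp.status_code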

answered Oct 17 '22 by prashanth basani