I am trying to serve a machine learning model via an API using Flask Blueprints. Here is my Flask __init__.py file:
from flask import Flask


def create_app(test_config=None):
    app = Flask(__name__)

    @app.route("/healthcheck")
    def healthcheck() -> str:
        return "OK"

    # Registers the machine learning blueprint
    from . import ml
    app.register_blueprint(ml.bp)

    return app
The ml.py file, which contains the blueprint for the /ml endpoint:
import numpy as np
from . import configuration as cfg
import tensorflow as tf
from flask import (
    Blueprint, flash, request, url_for
)

bp = Blueprint("ml", __name__, url_prefix="/ml")
keras_model = None
graph = None


@bp.before_app_first_request
def load_model():
    print("Loading keras model")
    global keras_model
    global graph
    with open(cfg.config["model"]["path"], 'r') as model_file:
        yaml_model = model_file.read()
        keras_model = tf.keras.models.model_from_yaml(yaml_model)
        graph = tf.get_default_graph()
        keras_model.load_weights(cfg.config["model"]["weights"])


@bp.route('/predict', methods=['POST'])
def predict() -> str:
    global graph
    features = np.array([request.get_json()['features']])
    print(features, len(features), features.shape)
    with graph.as_default():
        prediction = keras_model.predict(features)
    print(prediction)
    return "%.2f" % prediction
I run the server using a command-line script:
#!/bin/bash
export FLASK_APP=src
export FLASK_ENV=development
flask run
If I go to localhost:5000/healthcheck, I get the OK response, as I should. I then run the following curl command:
curl -X POST \
  http://localhost:5000/ml/predict \
  -H 'Cache-Control: no-cache' \
  -H 'Content-Type: application/json' \
  -d '{
    "features" : [17.0, 0, 0, 12.0, 1, 0, 0]
  }'
The first time, I get the response [[1.00]]. If I run it again, I get the following error:
tensorflow.python.framework.errors_impl.FailedPreconditionError:
Error while reading resource variable dense/kernel from
Container: localhost. This could mean that the variable was uninitialized.
Not found: Container localhost does not exist. (Could not find resource: localhost/dense/kernel)
[[{{node dense/MatMul/ReadVariableOp}}]]
If I modify the Blueprint file, the server detects the change and reloads. I can then call the API again: the first call returns the correct result, and every call after that fails with the same error. Why does this happen, and why only for calls after the first one?
You can try keeping a reference to the session that is used to load the model and then setting it as the session Keras uses in each request. That is, do the following:
import tensorflow as tf
from tensorflow.python.keras.backend import set_session
from tensorflow.python.keras.models import load_model

tf_config = some_custom_config  # placeholder: supply your own session config
sess = tf.Session(config=tf_config)
graph = tf.get_default_graph()

# IMPORTANT: models have to be loaded AFTER SETTING THE SESSION for keras!
# Otherwise, their weights will be unavailable in other threads once the session has been set there.
set_session(sess)
model = load_model(...)
and then in each request:
global sess
global graph
with graph.as_default():
    set_session(sess)
    model.predict(...)
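To illustrate why re-binding the session in each request might help, here is a minimal pure-Python sketch of the pattern. It assumes, as the comment above about threads suggests, that the Keras backend tracks its active session per thread and that the server handles later requests in new threads. The `set_session`, `get_session`, and `handle_request` functions below are hypothetical stand-ins written for this sketch, not the TensorFlow API, and `object()` stands in for a real `tf.Session`.

```python
import threading

# Hypothetical stand-in for a backend that stores its "current session" per thread.
_state = threading.local()


def set_session(session):
    """Bind a session to the calling thread only."""
    _state.session = session


def get_session():
    """Return the session bound to the calling thread, or None if unset."""
    return getattr(_state, "session", None)


def handle_request(results, key, shared_session=None):
    """Mimic a request handler; optionally re-bind the shared session first."""
    if shared_session is not None:
        set_session(shared_session)
    results[key] = get_session()


sess = object()      # stands in for the session created at load time
set_session(sess)    # bound once, in the thread that loads the model

results = {}

# Same thread as the loader: the session is visible.
handle_request(results, "same_thread")

# New thread without re-binding: the session is not visible.
t = threading.Thread(target=handle_request, args=(results, "new_thread"))
t.start()
t.join()

# New thread that re-binds the shared session first: visible again.
t = threading.Thread(target=handle_request, args=(results, "rebound", sess))
t.start()
t.join()

print(results["same_thread"] is sess)
print(results["new_thread"] is None)
print(results["rebound"] is sess)
```

If that assumption holds, it would also be consistent with the question's symptom: the loader registered with before_app_first_request runs in the same thread as the first request, so only that call sees the loaded weights.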