
"[CRITICAL] WORKER TIMEOUT" in logs when running "Hello Cloud Run with Python" from GCP Setup Docs

Following the "Hello Cloud Run with Python" tutorial, I have the following two files:

app.py

from flask import Flask, request

app = Flask(__name__)


@app.route('/', methods=['GET'])
def hello():
    """Return a friendly HTTP greeting."""
    who = request.args.get('who', 'World')
    return f'Hello {who}!\n'


if __name__ == '__main__':
    # Used when running locally only. When deploying to Cloud Run,
    # a webserver process such as Gunicorn will serve the app.
    app.run(host='localhost', port=8080, debug=True)

Dockerfile

# Use an official lightweight Python image.
# https://hub.docker.com/_/python
FROM python:3.7-slim

# Install production dependencies.
RUN pip install Flask gunicorn

# Copy local code to the container image.
WORKDIR /app
COPY . .

# Service must listen to $PORT environment variable.
# This default value facilitates local development.
ENV PORT 8080

# Run the web service on container startup. Here we use the gunicorn
# webserver, with one worker process and 8 threads.
# For environments with multiple CPU cores, increase the number of workers
# to be equal to the cores available.
CMD exec gunicorn --bind 0.0.0.0:$PORT --workers 1 --threads 8 app:app

I then build and run them using Cloud Build and Cloud Run:

PROJECT_ID=$(gcloud config get-value project)
DOCKER_IMG="gcr.io/$PROJECT_ID/helloworld-python"
gcloud builds submit --tag $DOCKER_IMG
gcloud run deploy --image $DOCKER_IMG --platform managed

The code appears to run fine, and I am able to access the app at the given URL. However, the logs seem to indicate a critical error, and the workers keep restarting. Here is the log output from Cloud Run after starting the app and making a few requests in my web browser:

2020-03-05T03:37:39.392Z Cloud Run CreateService helloworld-python ...
2020-03-05T03:38:03.285477Z[2020-03-05 03:38:03 +0000] [1] [INFO] Starting gunicorn 20.0.4
2020-03-05T03:38:03.287294Z[2020-03-05 03:38:03 +0000] [1] [INFO] Listening at: http://0.0.0.0:8080 (1)
2020-03-05T03:38:03.287362Z[2020-03-05 03:38:03 +0000] [1] [INFO] Using worker: threads
2020-03-05T03:38:03.318392Z[2020-03-05 03:38:03 +0000] [4] [INFO] Booting worker with pid: 4
2020-03-05T03:38:15.057898Z[2020-03-05 03:38:15 +0000] [1] [INFO] Starting gunicorn 20.0.4
2020-03-05T03:38:15.059571Z[2020-03-05 03:38:15 +0000] [1] [INFO] Listening at: http://0.0.0.0:8080 (1)
2020-03-05T03:38:15.059609Z[2020-03-05 03:38:15 +0000] [1] [INFO] Using worker: threads
2020-03-05T03:38:15.099443Z[2020-03-05 03:38:15 +0000] [4] [INFO] Booting worker with pid: 4
2020-03-05T03:38:16.320286ZGET200 297 B 2.9 s Safari 13  https://helloworld-python-xhd7w5igiq-ue.a.run.app/
2020-03-05T03:38:16.489044ZGET404 508 B 6 ms Safari 13  https://helloworld-python-xhd7w5igiq-ue.a.run.app/favicon.ico
2020-03-05T03:38:21.575528ZGET200 288 B 6 ms Safari 13  https://helloworld-python-xhd7w5igiq-ue.a.run.app/
2020-03-05T03:38:27.000761ZGET200 285 B 5 ms Safari 13  https://helloworld-python-xhd7w5igiq-ue.a.run.app/?who=me
2020-03-05T03:38:27.347258ZGET404 508 B 13 ms Safari 13  https://helloworld-python-xhd7w5igiq-ue.a.run.app/favicon.ico
2020-03-05T03:38:34.802266Z[2020-03-05 03:38:34 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:4)
2020-03-05T03:38:35.302340Z[2020-03-05 03:38:35 +0000] [4] [INFO] Worker exiting (pid: 4)
2020-03-05T03:38:48.803505Z[2020-03-05 03:38:48 +0000] [5] [INFO] Booting worker with pid: 5
2020-03-05T03:39:10.202062Z[2020-03-05 03:39:09 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:5)
2020-03-05T03:39:10.702339Z[2020-03-05 03:39:10 +0000] [5] [INFO] Worker exiting (pid: 5)
2020-03-05T03:39:18.801194Z[2020-03-05 03:39:18 +0000] [6] [INFO] Booting worker with pid: 6

Note the worker timeouts and reboots at the end of the logs. The fact that it's a CRITICAL error makes me think it shouldn't be happening. Is this expected behavior? Is it a side effect of the Cloud Run machinery starting and stopping my service as requests come and go?

asked Mar 05 '20 03:03 by jminardi


2 Answers

Cloud Run has scaled down one of your instances, and the gunicorn arbiter is considering it stalled.
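
To make "considering it stalled" concrete: Gunicorn's arbiter expects each worker to check in regularly and kills any worker that stays silent for longer than the timeout (30 seconds by default). The function below is a toy model of that check, not Gunicorn's actual code; its name and arguments are made up for illustration. It assumes the worker stops checking in while the instance sits idle.

```python
def is_considered_stalled(last_heartbeat: float, now: float, timeout: float) -> bool:
    """Toy model of an arbiter-style heartbeat check (hypothetical helper).

    last_heartbeat and now are timestamps in seconds; timeout is the
    allowed silence in seconds. A timeout of 0 turns the check off.
    """
    if not timeout:
        # A zero timeout means the worker is never treated as stalled.
        return False
    return now - last_heartbeat > timeout


# A worker that has been silent for 45 seconds:
print(is_considered_stalled(last_heartbeat=0.0, now=45.0, timeout=30))
print(is_considered_stalled(last_heartbeat=0.0, now=45.0, timeout=0))
```

With a 30-second timeout the silent worker is flagged; with a timeout of 0 it never is.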

Add --timeout 0 to your gunicorn invocation to disable the worker timeout entirely; it is unnecessary on Cloud Run.
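
One way to apply this setting without lengthening the CMD line is a Gunicorn config file, which is plain Python. The sketch below assumes a file named gunicorn.conf.py placed next to app.py (it is not part of the original tutorial). It mirrors the bind, workers, and threads values from the Dockerfile in the question; only the timeout value is new.

```python
# gunicorn.conf.py (assumed file name, placed next to app.py)
import os

# Cloud Run supplies PORT at runtime; 8080 mirrors the Dockerfile default.
bind = f"0.0.0.0:{os.environ.get('PORT', '8080')}"

# Same process layout as the CMD line in the question's Dockerfile.
workers = 1
threads = 8

# 0 disables the worker timeout, so idle workers are not killed.
timeout = 0
```

With this file in the image, the container could be started with gunicorn -c gunicorn.conf.py app:app. Alternatively, append --timeout 0 to the existing CMD line in the Dockerfile and skip the config file.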

answered Nov 20 '22 01:11 by Dustin Ingram


I was facing the error [11229] [CRITICAL] WORKER TIMEOUT (pid:11232) on Heroku. I changed my Procfile to this:

web: gunicorn --workers=3 app:app --timeout 200 --log-file -

Increasing the --timeout value fixed my problem.

answered Nov 20 '22 02:11 by Muhammad Zakaria