Looking for some guidance from people with practical GCR experience. How do you get on with this? I run a Docker container (approx. 670 MB in size) in Google Cloud Run. Inside is my Python server based on Flask, and it is currently run by this command in the Dockerfile:
CMD exec gunicorn --bind 0.0.0.0:8080 --reload --workers=1 --threads 8 --timeout 0 "db_app.app:create_app()"
Say I will need to serve about 300 requests per hour.
How many workers and threads should I specify in my exec command to use GCR's capabilities most effectively?
For example, the basic configuration of a GCR server is something like 1 CPU and 1 GB of RAM.
So how should I set up my Gunicorn there? Maybe I should also use --preload? Specify worker-connections?
As Dustin cited in his answer (see below), the official Google docs suggest writing this in the Dockerfile:
# Run the web service on container startup. Here we use the gunicorn
# webserver, with one worker process and 8 threads.
# For environments with multiple CPU cores, increase the number of workers
# to be equal to the cores available.
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 --timeout 0 main:app
I have no idea how many cores that "1 CPU" in the GCR configuration has, so I doubt this example code is very accurate; it's more likely there just to demonstrate how it works in general. So I (and everyone in my situation) would be very grateful if someone who has a working Gunicorn server packed into a container in Google Cloud Run could share some info about how to configure it properly: basically, what to put into this Dockerfile CMD line instead of the generic example code? Something more real-life-proof.
I think this is a software problem, because we're talking about writing things in a Dockerfile (the question was closed and marked as "not SO scope question").
The script runs fine without Gunicorn. On the command line I usually do: docker run -it -p 8080:8080 my_image_name and then Docker starts and listens.
The guidance from Google is the following configuration:
# Run the web service on container startup. Here we use the gunicorn
# webserver, with one worker process and 8 threads.
# For environments with multiple CPU cores, increase the number of workers
# to be equal to the cores available.
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 --timeout 0 main:app
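The "workers equal to the cores available" advice in that comment can also be expressed in a Gunicorn config file instead of on the CMD line. This is only a minimal sketch: the file name gunicorn.conf.py and the helper worker_count are assumptions made for illustration, and os.cpu_count() reports what the container's OS sees, which may not match the CPU limit configured for the Cloud Run service, so treat the result as a starting point rather than a tuned value.

```python
# gunicorn.conf.py (hypothetical file name), loaded with:
#   gunicorn -c gunicorn.conf.py main:app
import os


def worker_count(cpus=None):
    """Return one worker per CPU core, never fewer than 1 (illustrative helper)."""
    if cpus is None:
        cpus = os.cpu_count()  # can return None if the count is undetermined
    return max(1, cpus or 1)


# Gunicorn reads these module-level names as settings.
workers = worker_count()  # 1 on a single-CPU instance
threads = 8               # same thread count as the example above
timeout = 0               # same as --timeout 0: disables the worker timeout
```

With a single CPU this resolves to the same 1 worker and 8 threads as the CMD line above.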
Using --preload may reduce cold start times, but it may also lead to unexpected behavior, depending largely on how your application is structured.
You should not use --reload in production.
You should also bind to $PORT and not hard-code 8080 as the port.
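If you prefer a config file to CMD flags, the port advice might look like the sketch below. The helper bind_address and the fallback port are assumptions for illustration; the fallback only matters when running the container locally without PORT set.

```python
# Sketch of a Gunicorn config file fragment (e.g. passed with -c).
import os


def bind_address(environ=os.environ, default_port="8080"):
    """Build the bind string from the PORT environment variable.

    default_port is an assumed fallback for local runs where PORT is unset.
    """
    port = environ.get("PORT", default_port)
    return f"0.0.0.0:{port}"


bind = bind_address()  # follows $PORT instead of hard-coding 8080
reload = False         # code reloading is a development feature only
```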