
How can I run long tasks on Google App Engine, which uses gunicorn?

GAE flex uses gunicorn as the default entrypoint, which is fine, except that I have a function that takes a very long time to run (scraping websites and storing data in a db). gunicorn times out at 30 seconds by default, then a new worker starts the task all over again, and so on.

I can set the gunicorn timeout to something like 20 minutes, but it doesn't seem graceful. Is there any way to run these backend functions "outside" of gunicorn, or perhaps a gunicorn config I'm not thinking about? There is no client side, so the long time to complete isn't an issue.

My app.yaml file currently looks like this:

runtime: python
env: flex
entrypoint: gunicorn -b :$PORT main:app

runtime_config:
  python_version: 2

# This sample incurs costs to run on the App Engine flexible environment. 
# The settings below are to reduce costs during testing and are not appropriate
# for production use. For more information, see:
# https://cloud.google.com/appengine/docs/flexible/python/configuring-your-app-with-app-yaml
manual_scaling:
  instances: 1
resources:
  cpu: 1
  memory_gb: 3
  disk_size_gb: 10
Joe C. asked Nov 08 '22 12:11



1 Answer

You can use an async worker class, and then you won't need to raise the timeout to 20 minutes. The default worker class is sync. See the gunicorn docs on worker types.

Use the eventlet async worker (gevent is not recommended if you use the Google client libraries):

pip install eventlet

Then, when you launch gunicorn, set the worker class to eventlet and the number of workers to (number of cores x 2) + 1 (that's just a recommendation in the Google docs). For example:

CMD exec gunicorn --worker-class eventlet --workers 3 -b :$PORT main:app
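The same settings can also live in a gunicorn config file, which is plain Python. This is a minimal sketch, not part of the original answer: the file name gunicorn.conf.py and the 8080 fallback port are assumptions, and it computes the worker count from the (cores x 2) + 1 rule of thumb instead of hard-coding 3.

```python
# gunicorn.conf.py (assumed file name; pass it to gunicorn with -c)
import multiprocessing
import os

# Async worker class suggested above; requires `pip install eventlet`.
worker_class = "eventlet"

# Rule of thumb quoted above: (number of cores x 2) + 1.
workers = multiprocessing.cpu_count() * 2 + 1

# Bind to the port App Engine provides; 8080 is an assumed fallback for local runs.
bind = ":" + os.environ.get("PORT", "8080")
```

With that file in place, the entrypoint would become something like: gunicorn -c gunicorn.conf.py main:app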


Alternatively, hand the long task off to background workers through Pub/Sub, as in the implementation described here.
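The linked implementation is not reproduced here. As a rough illustration of the same idea, the sketch below has the request handler only enqueue a job while a separate worker does the slow part. It uses the standard-library queue and threading modules as an in-process stand-in for Pub/Sub, assumes Python 3, and every name in it (jobs, scrape_and_store, handle_request) is hypothetical.

```python
import queue
import threading

jobs = queue.Queue()  # stand-in for a Pub/Sub topic
results = []          # stand-in for the database the scraper writes to


def scrape_and_store(url):
    """Placeholder for the long-running scrape; here it only records the URL."""
    results.append("stored:" + url)


def worker():
    """Process jobs one at a time until a None sentinel arrives."""
    while True:
        url = jobs.get()
        if url is None:
            break
        scrape_and_store(url)


def handle_request(url):
    """What the web handler would do: enqueue the job and return at once."""
    jobs.put(url)
    return "accepted"


if __name__ == "__main__":
    t = threading.Thread(target=worker)
    t.start()
    print(handle_request("https://example.com"))
    jobs.put(None)  # tell the worker to stop
    t.join()
    print(results)
```

Because the handler returns as soon as the job is queued, the gunicorn worker never sits inside the long task, so the 30-second timeout no longer matters. A real Pub/Sub setup would add what this sketch lacks: jobs that survive a restart and workers that run in a separate service.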

Rob Curtis answered Nov 14 '22 22:11
