Increase time to run code for Google flexible app engine delaying DeadlineExceededError

I have a single function running on Google App Engine Flexible as part of an API call. The structure is something like this:

import externalmod
...
...

@app.route('/calc_here')
def calc():
    answer = externalmod.Method()

    return answer

The externalmod call is a complicated algorithm (no datastore, no urlfetch, just pure Python). It works for every possible case on the desktop, but for some inputs on App Engine, calling the endpoint gives the following error:

{
 "code": 13,
 "message": "BAD_GATEWAY",
 "details": [
  {
   "@type": "type.googleapis.com/google.rpc.DebugInfo",
   "stackEntries": [],
   "detail": "application"
  }
 ]
}

After looking at https://cloud.google.com/appengine/articles/deadlineexceedederrors and the following discussions: "How to increase Google App Engine request timer. Default is 60 sec"

and https://groups.google.com/forum/#!topic/google-appengine/3TtfJG0I9nA

I concluded that this happens because App Engine stops any request that runs for more than 60 seconds. I first tried the following, based on "Should Exception catch DeadlineExceededError exceptions?":

from google.appengine.runtime import DeadlineExceededError
try:
   answer = externalmod.Method()
except DeadlineExceededError:
   answer = some_default

but I got an error saying that there is no module google.appengine.

Then I realized that all those docs are for the standard environment, whereas I am using the flexible environment, so I reckoned google.appengine.runtime probably doesn't even exist there. When I did this:

try:
   answer = externalmod.Method()
except:
   answer = some_default

it worked, and I started catching some DeadlineExceededErrors. But apparently I can't always catch them this way: sometimes the error is caught and sometimes not. I figured the best approach would be to increase the amount of time the code is allowed to run, rather than just catching the exception.

I tried changing the app.yaml file by adding cpu: 2, but it didn't make any difference.

runtime_config:
  python_version: 3
resources:
  cpu: 2
  memory_gb: 4
manual_scaling:
  instances: 1

Maybe the question "Taskqueue for long running tasks in FLEXIBLE app engine"

could have a similar answer, but I have no idea what a task queue is, and I can't queue anything anyway: the critical function I am running is standalone, and I don't want to break it down just for some of the cases. It would be easier for me to simply increase the 60 s limit. How can I do that?

Vipluv asked Jun 03 '18 15:06


1 Answer

Since I didn't get any answer, I kept searching. I realise many others have similar issues.

The first thing to note is that the GAE flexible environment does not have most of the constraints of the standard environment. This means that DeadlineExceededError does not exist there, since there is no 60-second deadline. All modules and code run just as they would on any computer, since everything is contained inside Docker containers.

https://cloud.google.com/appengine/docs/flexible/python/migrating

Additionally, there is no google.appengine module. Depending on the language being used, all cloud interactions should happen through the google.cloud API: https://cloud.google.com/apis/docs/overview

Then what could possibly explain this timeout? I checked the logs in the Google Cloud project console and saw that the relevant error is actually [CRITICAL] WORKER TIMEOUT, which occurred exactly 30 seconds after the function was called. This has nothing to do with GAE flex itself but with the server framework, in my case gunicorn.
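To see how close a call gets to that 30-second mark, one option is to log its duration. The following is only a minimal sketch: log_duration and slow_calculation are hypothetical names, and slow_calculation merely stands in for the real externalmod.Method() call.

```python
import functools
import logging
import time


def log_duration(func):
    """Log how long each call to func takes, even if it raises."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            logging.info("%s took %.1f s", func.__name__, elapsed)
    return wrapper


@log_duration
def slow_calculation():
    # Hypothetical stand-in for externalmod.Method().
    time.sleep(0.1)
    return "done"
```

Calls that finish normally get a duration line in the logs. A call that is cut off by the worker timeout may not get the chance to log anything, so a missing line next to a WORKER TIMEOUT entry is itself a hint.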

The answer is actually provided here: https://serverfault.com/questions/490101/how-to-resolve-the-gunicorn-critical-worker-timeout-error/627746

Basically, following the documentation at http://docs.gunicorn.org/en/latest/settings.html#config-file

the only change needed is in the app.yaml file,

where earlier it was

runtime: python
env: flex
entrypoint: gunicorn -b :$PORT main:app

gunicorn workers have a default timeout of 30 seconds.

Change this to

entrypoint: gunicorn -t 120 -b :$PORT main:app

Here the timeout is 120 seconds; the value can be tuned by trial and error. In any case, this solved my particular problem of running code that takes longer than usual.
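The same setting can also live in a gunicorn config file, which is an ordinary Python file, as described in the settings documentation linked above. This is a sketch under assumptions: the file name gunicorn.conf.py and the 8080 fallback port are my own choices, not something from my original setup.

```python
# gunicorn.conf.py (assumed file name, placed next to main.py)
import os

# Bind to the port App Engine passes in via $PORT.
# 8080 is an assumed fallback for running locally.
bind = ":" + os.environ.get("PORT", "8080")

# Seconds a worker may stay silent before gunicorn kills and restarts it.
# The gunicorn default is 30; 120 matches the -t 120 flag used above.
timeout = 120
```

With such a file, the entrypoint would become something like `gunicorn -c gunicorn.conf.py main:app`, which keeps the app.yaml line short if more gunicorn settings are added later.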

Vipluv answered Nov 10 '22 21:11