Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I return data from a deferred task in Google App Engine

Original Question

I have a working version of my web application that I am trying to upgrade at the moment, and I'm running into the issue of having a task which takes too long to complete in during a single HTTP request. The application takes a JSON list from a JavaScript front end by an HTTP Post operation, and returns a sorted/sliced version of that list. As the input list gets longer, the sorting operation takes a much longer time to perform (obviously), so on suitably long input lists, I hit the 60 second HTTP request timeout, and the application fails.

I would like to start using the deferred library to perform the sort task, but I'm not clear on how to store/retrieve the data after I perform that task. Here is my current code:

class getLineups(webapp2.RequestHandler):
  def post(self):
    jsonstring = self.request.body
    inputData = json.loads(jsonstring)
    playerList = inputData["pList"]
    positions = ["QB","RB","WR","TE","DST"]

    playersPos = sortByPos(playerList,positions)
    rosters, playerUse = getNFLRosters(playersPos, positions)
    try:
      # This step is computationally expensive, it will fail on large player lists.
      lineups = makeLineups(rosters,playerUse,50000)

      self.response.headers["Content-Type"] = "application/json"
      self.response.out.write(json.dumps(lineups))
    except:
      logging.error("60 second timeout reached on player list of length:", len(playerList))
      self.response.headers["Content-Type"] = "text/plain"
      self.response.set_status(504)

app = webapp2.WSGIApplication([
  ('/lineup',getLineups),
], debug = True)

Ideally I would like to replace the entire try/except block with a call to the deferred task library:

deferred.defer(makeLineups,rosters,playerUse,50000)

But I'm unclear on how I would get the result back from that operation. I'm thinking I would have to store it in the Datastore, and then retrieve it, but how would my JavaScript front end know when the operation is complete? I've read the documentation on Google's site, but I'm still hazy on how to accomplish this task.

How I Solved It

Using the basic outline in the accepted answer, here's how I solved this problem:

def solveResult(result_key):
  result = result_key.get()

  playersPos = sortByPos(result.playerList, result.positions)
  rosters, playerUse = getNFLRosters(playersPos,result.positions)

  lineups = makeLineups(rosters,playerUse,50000)
  storeResult(result_key,lineups)

@ndb.transactional
def storeResult(result_key,lineups):
  result = result_key.get()
  result.lineups = lineups
  result.solveComplete = True
  result.put()

class Result(ndb.Model):
  playerList = ndb.JsonProperty()
  positions = ndb.JsonProperty()
  solveComplete = ndb.BooleanProperty()

class getLineups(webapp2.RequestHandler):
  def post(self):
    jsonstring = self.request.body
    inputData = json.loads(jsonstring)

    deferredResult = Result(
      playerList = inputData["pList"],
      positions = ["QB","RB","WR","TE","DST"],
      solveComplete = False
    )

    deferredResult_key = deferredResult.put()

    deferred.defer(solveResult,deferredResult_key)

    self.response.headers["Content-Type"] = "text/plain"
    self.response.out.write(deferredResult_key.urlsafe())

class queryResults(webapp2.RequestHandler):
  def post(self):
    safe_result_key = self.request.body
    result_key = ndb.Key(urlsafe=safe_result_key)

    result = result_key.get()
    self.response.headers["Content-Type"] = "application/json"

    if result.solveComplete:
      self.response.out.write(json.dumps(result.lineups))
    else:
      self.response.out.write(json.dumps([]))

The Javascript frontend then polls queryLineups URL for a fixed amount of time and stops polling if either the time limit expires, or it receives data back. I hope this is helpful for anyone else attempting to solve a similar problem. I have a bit more work to do to make it fail gracefully if things get squirrelly, but this works and just needs refinement.

like image 555
Patrick Gray Avatar asked Sep 26 '22 01:09

Patrick Gray


1 Answers

I'm not familiar with GAE, but this is a fairly generic question, so I can give you some advice.

Your general idea is correct, so I'm just going to expand on it. The workflow could look like this:

  1. You get the request to create the lineups. You create a new entity in the datastore for it. It should contain an ID (you'll need it to retrieve the result later) and a status (PENDING|DONE|FAILED). You can also save the data from the request, if that's useful to you.
  2. You defer the computation and return a response right away. The response will contain the ID of the task. When the computation is done, it will save the result of the task in the Datastore and update the status of the task. That result will contain the task ID, so that we can easily find it.
  3. Once the frontend receives the ID, it starts polling for the result. Using setTimeout or setInterval you send requests with the task ID to the server (this is a separate endpoint). The server checks the status of the task, and returns the result if it's done (error if failed).
  4. The frontend gets the data and stops polling.
like image 82
pablochan Avatar answered Oct 31 '22 19:10

pablochan