Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using yield with multiple ndb.get_multi_async

I am trying to improve efficiency of my current query from appengine datastore. Currently, I am using a synchronous method:

class Hospital(ndb.Model):
      name = ndb.StringProperty()
      buildings= ndb.KeyProperty(kind=Building,repeated=True)
class Building(ndb.Model):
      name = ndb.StringProperty()
      rooms= ndb.KeyProperty(kind=Room,repeated=True)
class Room(ndb.Model):
      name = ndb.StringProperty()
      beds = ndb.KeyProperty(kind=Bed,repeated=True)
class Bed(ndb.Model):
      name = ndb.StringProperty()
      .....

Currently I go through stupidly:

currhosp = ndb.Key(urlsafe=valid_hosp_key).get()
nbuilds = ndb.get_multi(currhosp.buildings)
for b in nbuilds:
   rms = ndb.get_multi(b.rooms)
   for r in rms:
      bds = ndb.get_multi(r.beds)
      for b in bds:
          do something with b object

I would like to transform this into a much faster query using get_multi_async

My difficulty is in how I can do this? Any ideas?

Best Jon

like image 678
Jon Avatar asked Jan 11 '13 00:01

Jon


2 Answers

using the given structures above, it is possible, and was confirmed that you can solve this with a set of tasklets. It is a SIGNIFICANT speed up over the iterative method.

@ndb.tasklet
def get_bed_info(bed_key):
    bed_info = {}
    bed = yield bed_key.get_async()
    format and store bed information into bed_info
    raise ndb.Return(bed_info)

@nbd.tasklet
def get_room_info(room_key):
    room_info = {}
    room = yield room_key.get_async()
    beds = yield map(get_bed_info,room.beds)
    store room info in room_info
    room_info["beds"] = beds
    raise ndb.Return(room_info)

@ndb.tasklet
def get_building_info(build_key):
    build_info = {}
    building = yield build_key.get_async()
    rooms = yield map(get_room_info,building.rooms)
    store building info in build_info
    build_info["rooms"] = rooms
    raise ndb.Return(build_info)

@ndb.toplevel
def get_hospital_buildings(hospital_object):
    buildings = yield map(get_building_info,hospital_object.buildings)
    raise ndb.Return(buildings)

Now comes the main call from the hospital function where you have the hospital object (hosp).

hosp_info = {}
buildings = get_hospital_buildings(hospital_obj)
store hospital info in hosp_info
hosp_info["buildings"] = buildings
return hosp_info

There you go! It is incredibly efficient and lets the schedule complete all the information in the fastest possible manner within the GAE backbone.

like image 140
Jon Avatar answered Nov 14 '22 11:11

Jon


You can do something with query.map(). See https://developers.google.com/appengine/docs/python/ndb/async#tasklets and https://developers.google.com/appengine/docs/python/ndb/queryclass#Query_map

like image 3
Guido van Rossum Avatar answered Nov 14 '22 10:11

Guido van Rossum