Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Minimize subqueries with IN queries on AppEngine (python)

Is there any clever way to avoid making a costly query with an IN clause in cases like the following one?

I'm using Google App Engine to build a Facebook application and at some point I (obviously) need to query the datastore to get all the entities that belong to any of the facebook friends of the given user.

Suppose I have a couple of entities modeled as such:

class Thing(db.Model):
    owner = db.ReferenceProperty(reference_class=User, required=True)
    owner_id = db.StringProperty(required=True)
    ...

and

class User(db.Model):
    id = db.StringProperty(required=True)
    ...

At some point I query Facebook to get the list of friends of a given user and I need to perform the following query

# get all Thing instances that belong to friends
query = Thing.all()
query.filter('owner_id IN', friend_ids)

If I did that, AppEngine would perform a subquery for each id in friend_ids, probably exceeding the maximum number of subqueries any query can spawn (30).

Is there any better way to do this (i.e. minimizing the number of queries)? I understand that there are no relations and joins using the datastore but, in particular, I would consider adding new fields to the User or Thing class if it helps in making things easier.

like image 959
abahgat Avatar asked Oct 18 '10 20:10

abahgat


2 Answers

I don't think there's an elegant solution, but you could try this:

On the User model, use Facebook ID as the key name, and store each user's list of things in a ListProperty.

class Thing(db.Model):
  ...

class User(db.Model):
  things = db.ListProperty(db.Key)
  ...

Entity creation would go like this:

user = User.get_or_insert(my_facebook_id)

thing = Thing()
thing.put()

user.things.append(thing.key())
user.put()

Retrieval takes 2 queries:

friends = User.get_by_key_name(friend_ids)
thing_keys = []

for friend in friends:
  thing_keys.extend(friend.things)

things = db.get(thing_keys)
like image 80
Drew Sears Avatar answered Nov 10 '22 06:11

Drew Sears


This Google I/O talk by Brett Slatkin addresses the exact situation you're dealing with. See also his follow up talk this year.

like image 3
Nick Johnson Avatar answered Nov 10 '22 05:11

Nick Johnson