Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does appengine implement query on a list efficiently?

From the appengine blog:

Advanced Query Planning - We are removing the need for exploding indexes and reducing the custom index requirements for many queries. The SDK will suggest better indexes in several cases and an upcoming article will describe what further optimizations are possible.

As a test, I have an Entity in appengine that has a listProperty

class Entity(db.Model):
  tags = db.StringListProperty()

I have 500,000 entities, half of them have tags = ['1'], and the other half have tags = ['2']

my query is

SELECT FROM Entity WHERE tags='1' and tags='2'

It returns no results really quickly. What plan is it using to achieve this? How is the list indexed to achieve this? In the old days, an exploding index would have been needed.

like image 381
Alan Avatar asked Nov 05 '22 15:11

Alan


1 Answers

The algorithm used internally ('merge-join') was described in the Google I/O 2009 tech talk Building Scalable, Complex Apps on App Engine. This functionality has also been available since the launch of GAE; the 'exploding indexes' only happen if you create a compound index of multiple StringListProperties.

It's worth noting that this functionality is actually a bit more general than you may realize - any combination of multiple equality filters on any arbitrary combination of properties can be satisfied without any compound indices, provided they're all equality filters and you don't have a sort order. They don't all have to be from a StringListProperty, and can even be split across multiple StringListProperty.

like image 61
bdonlan Avatar answered Nov 11 '22 05:11

bdonlan