
Does the NDB membership query ("IN" operation) performance degrade with lots of possible values?

The documentation for the IN query operation states that those queries are implemented as a big OR'ed equality query:

qry = Article.query(Article.tags.IN(['python', 'ruby', 'php']))

is equivalent to:

qry = Article.query(ndb.OR(Article.tags == 'python',
                           Article.tags == 'ruby',
                           Article.tags == 'php'))

I am currently modelling some entities for a GAE project and plan on using these membership queries with a lot of possible values:

qry = Player.query(Player.facebook_id.IN(list_of_facebook_ids))

where list_of_facebook_ids could have thousands of items.

Will this type of query perform well with thousands of possible values in the list? If not, what would be the recommended approach for modelling this?

Pascal Bourque asked Aug 13 '12 14:08


1 Answer

This won't work with thousands of values (in fact, I bet it starts degrading with more than 10 values). The only alternative I can think of is some form of precomputation. You'll have to change your schema.
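
The answer does not say which schema change to make. One possibility, offered here only as an assumption, is to store each Player under a key derived from its facebook_id, so that a membership test becomes a batched get-by-key rather than an OR of thousands of equality queries. The sketch below shows just the batching logic. The names fetch_by_ids, chunked, make_key and the batch size of 500 are hypothetical, and the key builder and batch getter are passed in so the code runs without App Engine. In an NDB app one would plausibly pass ndb.get_multi and a function that returns ndb.Key('Player', fid).

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_by_ids(ids, make_key, get_multi, batch_size=500):
    """Look up entities by key in batches, skipping ids with no entity.

    `make_key` and `get_multi` are injected (hypothetical parameters);
    `batch_size=500` is an arbitrary illustrative default, not a tuned value.
    """
    found = []
    for batch in chunked(list(ids), batch_size):
        keys = [make_key(fid) for fid in batch]
        # get_multi is expected to return None for keys with no entity.
        found.extend(e for e in get_multi(keys) if e is not None)
    return found


# Stand-in datastore for demonstration only: key -> entity.
_fake_store = {
    ("Player", "fb1"): {"name": "Ann"},
    ("Player", "fb3"): {"name": "Cy"},
}


def fake_make_key(fid):
    return ("Player", fid)


def fake_get_multi(keys):
    return [_fake_store.get(k) for k in keys]


players = fetch_by_ids(["fb1", "fb2", "fb3"], fake_make_key, fake_get_multi,
                       batch_size=2)
```

With this layout the cost grows with the number of ids looked up, not with the number of sub-queries an IN filter would fan out to. Whether it fits depends on facebook_id being unique per Player, which the question suggests but does not state.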

Guido van Rossum answered Oct 16 '22 10:10