
Google-app-engine NDB iter keys_only

Say I have a query that will be executed often, most likely yielding the same results.

Is it correct that using:

for key in qry.iter(keys_only=True):
    item = key.get()
    # do something with item

Would perform better than:

for item in qry:
    pass  # do something with item

My reasoning is that in the first example the query loads only the keys, and the subsequent calls to key.get() take advantage of NDB's caching mechanism, whereas the second example always fetches the entities from the datastore. Or have I misunderstood something?

Klaus Byskov Pedersen asked Feb 20 '23 23:02


2 Answers

I doubt that the first form would perform better: it is always possible that the values are not in the cache, and then, presuming you are getting more than one entity back, every key.get() is a separate roundtrip. That quickly gets slower.

A better approach is indeed what's shown in http://code.google.com/p/appengine-ndb-experiment/issues/detail?id=118: use ndb.get_multi(q.fetch(keys_only=True)), which fetches all the keys' entities in one batch. But even that is worse than the plain query if your cache hit rate is too low; this is extensively discussed in the issue.
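
To make the roundtrip argument concrete, here is a toy model in plain Python. It is not NDB: FakeStore, per_key_gets, batched_get, plain_query and count_round_trips are all hypothetical names invented for this sketch, and it assumes every store call costs exactly one roundtrip, which real queries and batch gets only approximate.

```python
# Toy model only (not real NDB): every FakeStore method call counts as
# one roundtrip, and a plain dict stands in for the cache.

class FakeStore:
    """Hypothetical stand-in for the datastore."""

    def __init__(self, entities):
        self.entities = dict(entities)
        self.round_trips = 0

    def query_entities(self):
        # Plain query: returns full entities in one roundtrip.
        self.round_trips += 1
        return list(self.entities.values())

    def query_keys(self):
        # Keys-only query: returns just the keys in one roundtrip.
        self.round_trips += 1
        return list(self.entities.keys())

    def get(self, key):
        # Single get: one roundtrip per key.
        self.round_trips += 1
        return self.entities[key]

    def get_multi(self, keys):
        # Batch get: one roundtrip for all requested keys.
        self.round_trips += 1
        return [self.entities[k] for k in keys]


def per_key_gets(store, cache):
    """Keys-only query, then one get per key that is not cached."""
    items = []
    for key in store.query_keys():
        if key not in cache:
            cache[key] = store.get(key)
        items.append(cache[key])
    return items


def batched_get(store, cache):
    """Keys-only query, then a single batch get for the cache misses."""
    keys = store.query_keys()
    missing = [k for k in keys if k not in cache]
    if missing:
        cache.update(zip(missing, store.get_multi(missing)))
    return [cache[k] for k in keys]


def plain_query(store):
    """Ordinary query that bypasses the cache entirely."""
    return store.query_entities()


def count_round_trips(strategy, *args):
    """Run a strategy against a fresh 10-entity store and count roundtrips."""
    store = FakeStore({"k%d" % i: "item%d" % i for i in range(10)})
    strategy(store, *args)
    return store.round_trips
```

In this model, with ten entities and an empty cache, the plain query costs 1 roundtrip, the per-key form costs 11 and the batched form costs 2. With a fully warm cache both keys-only forms drop to 1, which is why the comparison hinges on the cache hit rate.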

Guido van Rossum answered Feb 28 '23 09:02


AFAIK it will not make any difference, because internally NDB caches everything, including queries. If you are going to do other work with each item, try the async API; that can save valuable time. Edit: moreover, if NDB knows the query in advance, it can even prefetch the results.

I read this six months ago, so I am not sure what the current behavior is.

iamgopal answered Feb 28 '23 09:02