Say I have a query that will be executed often, most likely yielding the same results.
Is it correct that using:
for key in qry.iter(keys_only=True):
item = key.get()
#do something with item
Would perform better than:
for item in qry:
#do something with item
Because in the first example, the query will only load the keys and subsequent calls to key.get()
will take advantage of NDB's caching mechanism, whereas example 2 will always fetch the entities from the store? Or have I misunderstood something?
I would doubt that the second form would perform better -- it is always possible that the values are not in the cache, and then, presuming you are getting more than one entity back, you'd be making multiple roundtrips. That quickly gets slower.
A better approach is indeed what's shown in http://code.google.com/p/appengine-ndb-experiment/issues/detail?id=118 -- use ndb.multi_get(q.fetch(keys_only=True)). But even that is worse if your cache hit rate is too low; this is extensively discussed in the issue.
AFAIK It will not make any different, because internally, ndb caches everything, including query. If you are going to do other stuff with each one, try async api. that can save valuable time. edit : moreover, if ndb knows query in advance, it can even prefetch them.
I have read this six months back so not sure what is current behavior.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With