 

Most efficient way to get, modify, and put a batch of entities with ndb

In my app I have a few batch operations. Unfortunately, updating 400-500 entities sometimes takes forever. I have all the entity keys; I need to get the entities, update a property, and save them back to the datastore. Saving them can take up to 40-50 seconds, which is not what I'm looking for.

I'll simplify my model to explain what I do (which is pretty simple anyway):

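from google.appengine.ext import ndb

class Entity(ndb.Model):
    title = ndb.StringProperty()

# Placeholder: the ~500 keys are already known at this point.
keys = [key1, key2, key3, key4, ..., key500]

# Fetch all entities in one batch call.
entities = ndb.get_multi(keys)

# Update the same property on every entity.
for e in entities:
    e.title = 'the new title'

# Write everything back in one batch call (this is the slow step).
ndb.put_multi(entities)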

Getting and modifying does not take too long. I tried get_async, doing the gets in a tasklet, and whatever else is possible, but that only changes whether the get or the for loop takes longer.

But what really bothers me is that the put takes up to 50 seconds...

What is the most efficient way to do this operation in a decent amount of time? Of course I know that it depends on many factors, like the complexity of the entity, but the time the put takes is well over what I consider acceptable.
I already tried async operations and tasklets.

aschmid00 asked Apr 16 '12 20:04


1 Answer

I wonder if doing smaller batches of, e.g., 50 or 100 entities would be faster. If you make that into a tasklet, you can try running those tasklets concurrently.
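The batching idea can be sketched roughly as follows. This is only an illustration, not a measured fix: the helper names `chunked` and `put_in_batches` are made up here, and the async put function is passed in as a parameter so the sketch does not depend on the App Engine SDK. It assumes that function behaves like `ndb.put_multi_async`, i.e. it returns a list of futures that each have a `get_result()` method.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def put_in_batches(entities, put_multi_async, batch_size=100):
    """Start one async put per batch, then wait for all of them.

    `put_multi_async` is assumed to work like ndb.put_multi_async:
    it takes a list of entities and returns a list of futures.
    """
    pending = []
    # Kick off every batch before blocking on any result, so the
    # underlying RPCs have a chance to overlap.
    for batch in chunked(entities, batch_size):
        pending.extend(put_multi_async(batch))
    # Block until every write has finished and collect the results
    # (with ndb these would be the entity keys).
    return [future.get_result() for future in pending]
```

On App Engine this would presumably be called as `put_in_batches(entities, ndb.put_multi_async, batch_size=50)`. Whether 50, 100, or some other batch size is actually faster has to be measured.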

I also recommend looking at this with Appstats to see if that shows something surprising.
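For reference, Appstats on the Python runtime is commonly switched on by adding a recording hook to `appengine_config.py`. A minimal sketch, assuming a WSGI application and the `google.appengine.ext.appstats` package that ships with the SDK:

```python
# appengine_config.py (assumed to live in the application root)

def webapp_add_wsgi_middleware(app):
    """Wrap the WSGI app so Appstats records the RPCs each request makes."""
    # Imported inside the hook so the SDK is only needed when it runs.
    from google.appengine.ext.appstats import recording
    return recording.appstats_wsgi_middleware(app)
```

With recording on, the per-request timeline should show how the `put_multi` call is split into datastore RPCs and how long each one takes.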

Finally, assuming this uses the HRD, you may find that there is a limit on the number of entity groups per batch. The default for this limit is very low. Try raising it.

Guido van Rossum answered Nov 12 '22 14:11
