
Improve App Engine performance by reducing entity size

The objective is to reduce the CPU cost and response time for a piece of code that runs very often and must db.get() several hundred keys each time.

Does this even work?

Can I expect the API time of a db.get() with several hundred keys to reduce roughly linearly as I reduce the size of the entity? Currently the entity has the following data attached: 9 String, 9 Boolean, 8 Integer, 1 GeoPt, 2 DateTime, 1 Text (avg size ~100 bytes FWIW), 1 Reference, 1 StringList (avg size 500 bytes). The goal is to move the vast majority of this data to related classes so that the core fetch of the main model will be quick.

If it does work, how is it implemented?

After a refactor, will I still incur the same high cost fetching existing entities? The documentation says that all properties of a model are fetched simultaneously. Will the old unneeded properties still transfer over RPC on my dime and while users wait? In other words: if I want to reduce the load time of my entities, is it necessary to migrate the old entities to ones with the new definition? If so, is it sufficient to re-put() the entity, or must I save under a wholly new key?

Example

Consider:

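from google.appengine.ext import db

class Thing(db.Model):
    text    = db.TextProperty()
    strings = db.StringListProperty()
    num     = db.IntegerProperty()

# Store one entity of roughly 15 KB: 10 KB of text plus ten 500-byte strings.
thing = Thing(key_name='thing1', text='x' * 10240,
              strings=['y' * 500 for i in range(10)], num=23)
thing.put()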

Let's say I re-define Thing to be streamlined and push up a new version:

class Thing(db.Model):
    num = db.IntegerProperty()

And I fetch it again:

thing_again = Thing.get_by_key_name('thing1')

Have I reduced the fetch time for this entity?

JasonSmith asked Oct 10 '09 11:10



2 Answers

To answer your questions in order:

  • Yes, splitting up your model will reduce the fetch time, though probably not linearly. For a relatively small model like yours, the differences may not be huge. Large list properties are the leading cause of increased fetch time.
  • Old properties will still be transferred when you fetch an entity after the change to the model, because the datastore has no knowledge of models.
  • However, the removed properties will remain stored even after you call .put() again. Currently, there are two ways to eliminate the old properties: replace all the existing entities with new ones, or use the lower-level api.datastore interface, which is dict-like and makes it easy to delete keys.
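
The second option above might look roughly like the following. This is a minimal sketch, assuming the legacy Python runtime where google.appengine.api.datastore entities behave like dicts; the helper names strip_properties and strip_stored_entity are made up for illustration, and the kind and key name come from the question's example.

```python
def strip_properties(entity, names):
    """Delete the named properties from a dict-like entity.

    Returns the list of property names that were actually present and removed.
    """
    removed = [name for name in names if name in entity]
    for name in removed:
        del entity[name]
    return removed


def strip_stored_entity(kind, key_name, names):
    """Fetch one entity through the low-level API, drop old properties, re-save.

    Hypothetical helper: the SDK is imported lazily so that strip_properties
    can be exercised on a plain dict without App Engine installed.
    """
    from google.appengine.api import datastore
    from google.appengine.ext import db

    entity = datastore.Get(db.Key.from_path(kind, key_name))
    # Only write back if something was actually removed.
    if strip_properties(entity, names):
        datastore.Put(entity)


# Intended usage inside an App Engine request or task (not run here):
# strip_stored_entity('Thing', 'thing1', ['text', 'strings'])
```

For several hundred entities this would presumably be run in batches (for example from a task queue) rather than one key at a time.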
Nick Johnson answered Nov 01 '22 08:11



To remove properties from an entity, you can change your Model to an Expando and then use delattr. It's documented in the App Engine docs, under the heading "Removing Deleted Properties from the Datastore":

http://code.google.com/intl/fr/appengine/articles/update_schema.html
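
Applied to the Thing model from the question, the Expando approach could be sketched as follows. This is only an illustration under assumptions: drop_attributes and migrate are hypothetical helper names, and the model class is passed in so that the attribute-stripping part can be tried on any plain object.

```python
def drop_attributes(obj, names):
    """Delete each named attribute that is present on obj.

    Returns the list of names that were removed.
    """
    removed = []
    for name in names:
        if hasattr(obj, name):
            delattr(obj, name)
            removed.append(name)
    return removed


def migrate(model_class, key_name, names):
    """Load one entity, strip the old dynamic properties, and re-save it.

    model_class is assumed to be a db.Expando subclass, for example:

        from google.appengine.ext import db

        class Thing(db.Expando):
            num = db.IntegerProperty()
    """
    thing = model_class.get_by_key_name(key_name)
    # get_by_key_name returns None when no such entity exists.
    if thing is not None and drop_attributes(thing, names):
        thing.put()


# Intended usage on App Engine (not run here):
# migrate(Thing, 'thing1', ['text', 'strings'])
```

With the model declared as an Expando, the stored text and strings values show up as dynamic properties, which is what makes delattr applicable to them.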

Danny Tuppeny answered Nov 01 '22 07:11
