At what rate do indexes "explode" in GAE's big table?
The excerpt from their documentation below explains that for collection values, indexes can "explode" exponentially.
Does this mean that for an object with two collection values, there is an index entry for each subset of values in the first collection paired with each subset in the second collection? Or is there only an index entry for each possible pair of values?
widget:{
mamas_list: ['cookies', 'puppies']
papas_list: ['rain', 'sun']
}
cookies rain
cookies puppies rain
cookies puppies rain sun
cookies sun
cookies rain sun
puppies rain
puppies sun
puppies rain sun
cookies rain
cookies sun
puppies rain
puppies sun
Source: https://developers.google.com/appengine/docs/python/datastore/indexes#Index_Limits
an entity that can have multiple values for the same property requires a separate index entry for each value; again, if the number of possible values is large, such an entity can exceed the entry limit.
The situation becomes worse in the case of entities with multiple properties, each of which can take on multiple values. To accommodate such an entity, the index must include an entry for every possible combination of property values. Custom indexes that refer to multiple properties, each with multiple values, can "explode" combinatorially, requiring large numbers of entries for an entity with only a relatively small number of possible property values. (Taken from: )
Chris,
You'll only have an 'exploding index problem' in cases you explicitly add an index.yaml entry for multiple repeated properties, and when objects saved to the table have too many multiple properties.
In the example, does your index.yaml add this index?
- kind: widget
properties:
- name: mamas_list
- name: papas_list
If you save the sample object to the datastore:
widget(mamas_list=['a', 'b'], papas_list['c', 'd']).put()
There will be 4 different indexes to be saved:
['a', 'c'] ['a', 'd'] ['b', 'c'] ['b', 'd']
The whole purpose of adding this index would be to allow querying by these 2 properties:
widget.query().filter(mamas_list=='a').filter(papas_list=='d').fetch()
You could always avoid an exploding index (not found in this sample case), using the zig-zag algorithm indexes:
http://www.google.com/events/io/2010/sessions/next-gen-queries-appengine.html
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With