
Google NDB: Best way to read child entities from an entity, repeated property vs regular query?

Let's say I have this really simple parent/child relationship (every Answer instance always has a Question parent):

class Answer(ndb.Model):
    content = ndb.StringProperty()
    timestamp = ndb.DateTimeProperty()

    def to_message(self):
        """Returns a protoRPC message object of the answer"""


class Question(ndb.Model):
    content = ndb.StringProperty()
    answers = ndb.KeyProperty(repeated=True, kind='Answer')

    def to_message(self):
        """Returns a protoRPC message object of the question"""

The two to_message methods simply return a protoRPC object. The question is: in the to_message method of the Question class, if I want to fetch all child Answer instances and use their own to_message method to turn them into a nice RPC message, is it better to:

  • Iterate over the answers repeated KeyProperty list
  • Do a query using a filter on the "parent" property, and iterate over the list it outputs

In terms of NDB access, the first method seems to be the best, but since we're going to go over the free limit anyway, I'm wondering whether the datastore is more efficient at fetching entities than I am when iterating over that list.

Edit: The original question actually has a very simple and obvious answer: the first way. The real question is what to do when I have to filter out some Answer entities based on their attributes (for instance timestamp): is it better to query using a filter, or to iterate over the list and use a condition to gather only the "interesting" entities?
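To make the second option in the edit concrete, here is a minimal sketch of filtering already-fetched answers in memory by timestamp. FetchedAnswer and answers_since are hypothetical names introduced only for illustration; the namedtuple stands in for an Answer entity so the snippet runs without App Engine.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for an Answer entity that has already been fetched.
FetchedAnswer = namedtuple("FetchedAnswer", ["content", "timestamp"])


def answers_since(answers, cutoff):
    """Keep only the answers whose timestamp is at or after the cutoff."""
    # Skip entities with no timestamp set, since None cannot be compared to a datetime.
    return [a for a in answers
            if a.timestamp is not None and a.timestamp >= cutoff]


answers = [
    FetchedAnswer("old answer", datetime(2014, 7, 1)),
    FetchedAnswer("new answer", datetime(2014, 7, 12)),
    FetchedAnswer("undated answer", None),
]
recent = answers_since(answers, datetime(2014, 7, 10))
```

Note that this approach still fetches every answer before discarding the uninteresting ones, which is the cost to weigh against a filtered query.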

Hadrien Titeux asked Jul 12 '14 11:07


2 Answers

With that schema you don't have to query anything, because you already have the key of each answer in the list question_entity.answers.

So you only have to fetch the answers using those keys. It is better to get all the answers in a single operation:

list_of_answers = ndb.get_multi(question_entity.answers)

(More info at NDB Entities and Keys)
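As a rough sketch of how the batch fetch and the to_message conversion could be combined: ndb.get_multi returns None in place of any key that has no stored entity, so it can be worth skipping those. The helper answers_to_messages, the FakeAnswer class and fake_get_multi are hypothetical stand-ins so the snippet runs without App Engine; in real code you would pass ndb.get_multi and question_entity.answers instead.

```python
def answers_to_messages(answer_keys, get_multi):
    """Fetch all answers in one batch call and convert the ones that exist."""
    entities = get_multi(answer_keys)
    # A missing entity comes back as None, so filter those out before converting.
    return [entity.to_message() for entity in entities if entity is not None]


class FakeAnswer(object):
    """Hypothetical stand-in for an Answer entity."""

    def __init__(self, content):
        self.content = content

    def to_message(self):
        # A plain dict stands in for the protoRPC message here.
        return {"content": self.content}


# Hypothetical in-memory "datastore" keyed by plain strings instead of ndb keys.
STORE = {"a1": FakeAnswer("first"), "a2": FakeAnswer("second")}


def fake_get_multi(keys):
    """Mimics the shape of ndb.get_multi: one result per key, None if absent."""
    return [STORE.get(key) for key in keys]


messages = answers_to_messages(["a1", "missing", "a2"], fake_get_multi)
```

Passing the fetch function in as a parameter is only a convenience for this sketch; it keeps the conversion logic testable outside the App Engine runtime.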

On the other hand, if you model that relationship with a KeyProperty in Answer:

class Answer(ndb.Model):
    question = ndb.KeyProperty(Question)
    content = ndb.StringProperty()
    timestamp = ndb.DateTimeProperty()

    def to_message(self):
        """Returns a protoRPC message object of the answer"""

or with ancestors:

answer = Answer(parent=question_entity.key)

In these cases you should use a normal query to retrieve the answers:

answers = Answer.query(Answer.question == question_entity.key)

or an ancestor query:

answers = Answer.query(ancestor=question_entity.key)

respectively.

This means two jobs: querying the index, then fetching the entities from the datastore. In conclusion, the first approach (fetching by the keys already stored on the question) is cheaper for retrieving the data.

Hernán Acosta answered Oct 07 '22 15:10


Using ndb.get_multi on the list of keys to fetch the Answers, and then iterating over them to call their to_message methods, will be the most efficient.

Greg answered Oct 07 '22 15:10