
How to order an NDB query by the key?

I'm trying to use task queues on Google App Engine, specifically the Mapper class shown in the App Engine article "Background work with the deferred library". I get an exception when ordering the query result by the key:

def get_query(self):
    ...
    q = q.order("__key__")
    ...

Exception:

File "C:... mapper.py", line 41, in get_query
    q = q.order("__key__")
  File "C:\Program Files (x86)\Google\google_appengine\google\appengine\ext\ndb\query.py", line 1124, in order
    'received %r' % arg)
TypeError: order() expects a Property or query Order; received '__key__'
INFO     2017-03-09 11:56:32,448 module.py:806] default: "POST /_ah/queue/deferred HTTP/1.1" 500 114

The article is from 2009, so I guess something might have changed. My environment: Windows 7, Python 2.7.9, Google App Engine SDK 1.9.50.
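
From the error message I gather that ndb's order() wants an actual Property object rather than the '__key__' string the old db API accepted. If that's right, something like the following sketch is presumably the intended form (Person is a made-up model here, only to illustrate the call):

from google.appengine.ext import ndb

class Person(ndb.Model):  # hypothetical model, just to illustrate the call
    name = ndb.StringProperty()

# Old db-style call from the 2009 article -- raises the TypeError above in ndb:
# q = Person.query().order("__key__")

# Passing the key property object seems to be what ndb's order() expects:
q = Person.query().order(Person._key)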

There are somewhat similar questions about ordering in NDB on Stack Overflow. What bugs me is that this code comes from the official documentation, presumably updated in February 2017 (i.e. recently), and was posted by someone in the top 0.1% of SO users by reputation.

So I must be doing something wrong. What is the solution?

1 Answer

Bingo, Avinash Raj is correct. If it were posted as an answer I'd accept it. Here is the full, corrected class code:

#!/usr/bin/python2.7
# -*- coding: utf-8 -*-
from google.appengine.ext import deferred
from google.appengine.ext import ndb
from google.appengine.runtime import DeadlineExceededError
import logging

class Mapper(object):
    """
    from https://cloud.google.com/appengine/docs/standard/python/ndb/queries
    corrected with suggestions from Stack Overflow
    http://stackoverflow.com/questions/42692319/how-to-order-ndb-query-by-the-key
    """
    # Subclasses should replace this with a model class (e.g., model.Person).
    KIND = None

    # Subclasses can replace this with a list of (property, value) tuples to filter by.
    FILTERS = []

    def __init__(self):
        logging.info("Mapper.__init__")
        self.to_put = []
        self.to_delete = []

    def map(self, entity):
        """Updates a single entity.
        Implementers should return a tuple containing two iterables (to_update, to_delete).
        """
        return ([], [])

    def finish(self):
        """Called when the mapper has finished, to allow for any final work to be done."""
        pass

    def get_query(self):
        """Returns a query over the specified kind, with any appropriate filters applied."""
        q = self.KIND.query()
        for prop, value in self.FILTERS:
            q = q.filter(prop == value)
        # Order by key so _continue can resume from the last processed entity.
        # The fixed version: the original q.order('__key__') from the docs failed, see
        # http://stackoverflow.com/questions/42692319/how-to-order-ndb-query-by-the-key
        q = q.order(self.KIND.key)
        return q

    def run(self, batch_size=100):
        """Starts the mapper running."""
        logging.info("Mapper.run: batch_size: {}".format(batch_size))
        self._continue(None, batch_size)

    def _batch_write(self):
        """Writes updates and deletes entities in a batch."""
        if self.to_put:
            ndb.put_multi(self.to_put)
            self.to_put = []
        if self.to_delete:
            ndb.delete_multi(self.to_delete)
            self.to_delete = []

    def _continue(self, start_key, batch_size):
        q = self.get_query()
        # If we're resuming, pick up where we left off last time.
        if start_key:
            key_prop = getattr(self.KIND, '_key')
            q = q.filter(key_prop > start_key)
        # Keep updating records until we run out of time.
        try:
            # Steps over the results, returning each entity and its index.
            for i, entity in enumerate(q):
                map_updates, map_deletes = self.map(entity)
                self.to_put.extend(map_updates)
                self.to_delete.extend(map_deletes)
                # Do updates and deletes in batches.
                if (i + 1) % batch_size == 0:
                    self._batch_write()
                # Record the last entity we processed.
                start_key = entity.key
            self._batch_write()
        except DeadlineExceededError:
            # Write any unfinished updates to the datastore.
            self._batch_write()
            # Queue a new task to pick up where we left off.
            deferred.defer(self._continue, start_key, batch_size)
            return
        self.finish()
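
For completeness, here is a rough sketch of how I drive the class (Person and PersonMapper are made-up names, not from the docs; adjust to your own model):

class Person(ndb.Model):                 # hypothetical model, for illustration only
    name = ndb.StringProperty()
    processed = ndb.BooleanProperty(default=False)

class PersonMapper(Mapper):              # hypothetical subclass of the Mapper above
    KIND = Person
    FILTERS = [(Person.processed, False)]

    def map(self, entity):
        # Mark each entity and hand it back for the batched put.
        entity.processed = True
        return ([entity], [])            # (to_put, to_delete)

# Enqueue the job as a background task; _continue re-defers itself if it
# hits the request deadline, so the whole dataset is processed in batches.
deferred.defer(PersonMapper().run, batch_size=100)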