Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Google App Engine - getting count of records that match criteria over 1000

I've read in multiple locations that GAE lifted the 1000 record limit on queries and counts, however, I can only seem to get a count of the records up to 1000. I won't be pulling more than 1000 queries at a time, but the requirements are such that I need a count of the matching records.

I understand you can use cursors to "paginate" through the dataset, but to cycle through just to get a count seems a bit much. Presumably when they said they "lifted" the limit, it was the hard limit - you still need to cycle through the results 1000 at a time, am I correct?

Should I be using a method other than the .all()/filter method to generate 1000+ counts?

Thanks in advance for all your help!

like image 690
etc Avatar asked Sep 26 '10 19:09

etc


1 Answers

The behavior of Query.count() is inconsistent with the documentation when no limit is explicitly specified - the documentation indicates that it will count "until it finishes counting or times out." GAE Issue 3671 reported this bug (about 3 weeks ago).

The workaround: explicitly specify a limit and then that value will be used (rather than the default of 1,000).

Testing on http://shell.appspot.com demonstrates this:

# insert 1500 TestModel entites ...
# ...
>>> TestModel.all(keys_only=True).count()
1000L
>>> TestModel.all(keys_only=True).count(10000)
1500L

I also see the same behavior on the latest version of the development server (1.3.7) using this simple test app:

from google.appengine.ext import webapp, db
from google.appengine.ext.webapp.util import run_wsgi_app

class Blah(db.Model): pass

class MainPage(webapp.RequestHandler):
    def get(self):
        for i in xrange(3):
            db.put([Blah() for i in xrange(500)])  # can only put 500 at a time ...
        c = Blah.all().count()
        c10k = Blah.all().count(10000)
        self.response.out.write('%d %d' % (c,c10k))
        # prints "1000 1500" on its first run

application = webapp.WSGIApplication([('/', MainPage)])

def main(): run_wsgi_app(application)
if __name__ == '__main__': main()
like image 139
David Underhill Avatar answered Sep 24 '22 19:09

David Underhill