
How to build a custom DatastoreInputReader based off a query?

How would I go about building a DatastoreInputReader that is based on a query, instead of reading every single entity of that kind? The rationale is to minimize datastore reads (since the query is indexed to a subset of entities) and to cut processing time.

  1. First, is this a good idea? Would a query-backed custom DatastoreInputReader actually save time and processing, or would the query itself cancel out MapReduce parallelism or add other overhead?

  2. Second, how do I do it? I have been reading *input_readers.py*, and it's not clear how to subclass AbstractDatastoreInputReader to do this. Perhaps someone can explain the process for implementing something like this, since it's hard to work out from the code alone and the documentation is outdated or nonexistent.

Brownie points for anyone who can point to working code (GitHub or elsewhere) that shows a custom DatastoreInputReader implementation.

This would be huge in making App Engine MapReduce more accessible and developer-friendly ;-)

Johnny Wong asked May 19 '26 17:05


1 Answer

DatastoreInputReader supports filters now! See the source: http://code.google.com/p/appengine-mapreduce/source/browse/trunk/python/src/mapreduce/input_readers.py
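To illustrate what using filters might look like, here is a minimal sketch that only builds the mapper parameters and does not import the mapreduce library. It assumes the reader takes a "filters" parameter holding a list of (property, operator, value) tuples alongside "entity_kind", and that only equality filters are accepted; check input_readers.py for the exact rules in your version. The helper build_reader_params, the kind models.Order, and the status property are hypothetical names used for illustration.

```python
def build_reader_params(entity_kind, filters):
    """Build a mapper parameter dict for a filtered datastore read.

    Hypothetical helper, not part of the mapreduce library. It assumes
    the reader expects "entity_kind" plus a "filters" list of
    (property, operator, value) tuples, with "=" as the only operator.
    """
    checked = []
    for f in filters:
        if len(f) != 3:
            raise ValueError("filter must be (property, operator, value): %r" % (f,))
        prop, op, value = f
        if op != "=":
            # Assumption: only equality filters are accepted by the reader.
            raise ValueError("unsupported operator: %r" % (op,))
        checked.append((prop, op, value))
    return {"entity_kind": entity_kind, "filters": checked}


# Hypothetical kind and property names.
params = build_reader_params("models.Order", [("status", "=", "open")])

# Assumption: a dict like this would be passed as the mapper parameters
# when starting a job with
# reader_spec="mapreduce.input_readers.DatastoreInputReader".
```

With filters of this kind, the reader should still be able to split the work across shards while each shard reads only the matching entities, which is the saving the question is after.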

lucemia answered May 22 '26 07:05