I have a database set up in MongoDB that I'm accessing with pymongo.
I'd like to be able to pull a small set of fields into a list of dictionaries, something like what I get in the mongo shell when I type:
db.find({},{"variable1_of_interest":1, "variable2_of_interest":1}).limit(2).pretty()
I'd like a python statement like:
x = db.find({},{"variable1_of_interest":1, "variable2_of_interest":1})
where x is an array structure of some kind rather than a cursor. That is, I want to avoid iterating, like:
data = []
x = db.find({},{"variable1_of_interest":1, "variable2_of_interest":1})
for i in x:
    data.append(i)  # append each document, not the cursor itself
Is it possible that I could use MapReduce to bring this into a one-liner? Something like
db.find({},{"variable1_of_interest":1, "variable2_of_interest":1}).map_reduce(mapper, reducer, "data")
I intend to output this dataset to R for some analysis, but I'd like to concentrate the I/O in Python.
You don't need to call mapReduce; just turn the cursor into a list like so:
>>> data = list(col.find({},{"a":1,"b":1,"_id":0}).limit(2))
>>> data
[{u'a': 1.0, u'b': 2.0}, {u'a': 2.0, u'b': 3.0}]
where col is your db.collection object.
But be careful with large result sets, because list() loads every document into memory at once.
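If the result set is too large to hold in a list and the goal is just to hand the data to R, one option is to stream the cursor straight to a CSV file instead. The following is a minimal sketch, not part of the original answer: write_rows_to_csv is a hypothetical helper name, and it assumes the projected documents are flat (no nested fields). It only relies on the cursor being iterable, so it works with any iterable of dicts.

```python
import csv
from typing import Any, Dict, Iterable, Sequence


def write_rows_to_csv(rows: Iterable[Dict[str, Any]],
                      fieldnames: Sequence[str],
                      path: str) -> int:
    """Write dict-like rows to a CSV file one at a time.

    `rows` can be a pymongo cursor or any other iterable of dicts.
    Only one document is held in memory at a time.
    Returns the number of rows written.
    """
    count = 0
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(
            handle,
            fieldnames=fieldnames,
            restval="",             # documents missing a field get an empty cell
            extrasaction="ignore",  # fields not listed in fieldnames are skipped
        )
        writer.writeheader()
        for row in rows:
            writer.writerow(row)
            count += 1
    return count


# Hypothetical usage with the cursor from the answer above
# (assumes `col` is your db.collection object):
# cursor = col.find({}, {"a": 1, "b": 1, "_id": 0})
# write_rows_to_csv(cursor, ["a", "b"], "data.csv")
```

The resulting file can then be loaded in R with read.csv("data.csv").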
What you can do is call map_reduce in pymongo and pass it the find query as an argument. It could look like this:
db.yourcollection.map_reduce(map_function, reduce_function, "data", query={})
As for the projections, I think you would need to do them in the reduce function, since query only specifies the selection criteria, as the MongoDB documentation says.