I am very new to MongoDB/PyMongo. I have successfully imported my data into MongoDB and would like to use the group function to group similar rows together. For example, if my data set looks like this:
data = [{uid: 1 , event: 'a' , time: 1} ,
{uid: 1 , event: 'b' , time: 2} ,
{uid: 2 , event: 'c' , time: 2} ,
{uid: 3 , event: 'd' , time: 4}
]
How do I use the group function to group the above rows according to the uid field such that the output is as follows?
{ {uid: 1} : [{uid: 1 , event: 'a' , time: 1} , {uid: 1 , event: 'b' , time: 2} ],
{uid: 2} : [{uid: 2 , event: 'c' , time: 2} ],
{uid: 3} : [{uid: 3 , event: 'd' , time: 4} ] }
I read through the examples at http://www.mongodb.org/display/DOCS/Aggregation. However, it seems to me that those examples always aggregate into a single number or object.
Thanks,
To try out grouping by multiple fields, we first need some data in the database, so we create a database first. This is done by giving the database name after the keyword “use.” In this example, the database is named “demo.”
MongoDB group by multiple fields using the aggregate operation
First, the key on which the grouping is based is selected, and then the collection is divided into groups according to the value of that key. You can then create a final document by aggregating the documents in each group.
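The three steps described above (pick a key, split the documents into groups by that key, aggregate each group) can be sketched in plain Python without a database. This is only an illustration of the idea, not how MongoDB implements it; the helper names `group_by` and `summarize` are made up for this sketch, and the data is the sample from the question.

```python
from collections import defaultdict

# Sample documents from the question.
data = [
    {"uid": 1, "event": "a", "time": 1},
    {"uid": 1, "event": "b", "time": 2},
    {"uid": 2, "event": "c", "time": 2},
    {"uid": 3, "event": "d", "time": 4},
]


def group_by(docs, fields):
    """Hypothetical helper: split docs into groups keyed by the given fields."""
    groups = defaultdict(list)
    for doc in docs:
        # The group key is a tuple, so one or several fields work the same way.
        key = tuple(doc[f] for f in fields)
        groups[key].append(doc)
    return dict(groups)


def summarize(groups):
    """Hypothetical helper: build one summary document per group."""
    return [
        {"_id": key, "count": len(members), "last_time": max(d["time"] for d in members)}
        for key, members in groups.items()
    ]


by_uid = group_by(data, ["uid"])                  # single-field key
by_uid_and_time = group_by(data, ["uid", "time"])  # multi-field key
summary = summarize(by_uid)
```

With the sample data, `by_uid` has three groups and `by_uid_and_time` has four, because every (uid, time) pair is distinct.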
The _id expression specifies the group key. If you specify an _id value of null, or any other constant value, the $group stage returns a single document that aggregates values across all of the input documents. See the Group by Null example.
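As a rough sketch of the difference between a field-based group key and a constant one, the two pipelines below could be passed to a collection's aggregate method. The field names `count` and `last_time` and the helper `run` are assumptions made for this illustration; `coll` is assumed to be a PyMongo collection supplied by the caller.

```python
# Group key is the uid field: one output document per distinct uid.
per_uid_pipeline = [
    {"$group": {"_id": "$uid", "count": {"$sum": 1}}},
]

# Constant group key (None is sent as null): a single output document
# that aggregates over every input document.
overall_pipeline = [
    {"$group": {"_id": None, "count": {"$sum": 1}, "last_time": {"$max": "$time"}}},
]


def run(coll, pipeline):
    """Hypothetical helper: run a pipeline and collect the results in a list."""
    return list(coll.aggregate(pipeline))
```

Calling `run(coll, overall_pipeline)` on the sample data from the question should yield exactly one document, while `run(coll, per_uid_pipeline)` should yield one per uid.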
You needn't use the reduce function to actually reduce anything; it can simply append each document to a list held in the group's initial value. For example:
>>> coll.insert(dict(uid=1,event='a',time=1))
ObjectId('4d5b91d558839f06a8000000')
>>> coll.insert(dict(uid=1,event='b',time=2))
ObjectId('4d5b91e558839f06a8000001')
>>> coll.insert(dict(uid=2,event='c',time=2))
ObjectId('4d5b91f358839f06a8000002')
>>> coll.insert(dict(uid=3,event='d',time=4))
ObjectId('4d5b91fd58839f06a8000003')
>>> result = coll.group(['uid'], None,
...                     {'list': []},  # initial
...                     'function(obj, prev) {prev.list.push(obj)}')  # reducer
>>> len(result) # will show three groups
3
>>> int(result[0]['uid'])
1
>>> result[0]['list']
[{u'event': u'a', u'_id': ObjectId('4d5b...0000'), u'uid': 1, u'time': 1},
{u'event': u'b', u'_id': ObjectId('4d5b...0001'), u'uid': 1, u'time': 2}]
>>> int(result[1]['uid'])
2
>>> result[1]['list']
[{u'event': u'c', u'_id': ObjectId('4d5b...0002'), u'uid': 2, u'time': 2}]
>>> int(result[2]['uid'])
3
>>> result[2]['list']
[{u'event': u'd', u'_id': ObjectId('4d5b...0003'), u'uid': 3, u'time': 4}]
I've shortened the object IDs in the above listing to improve readability.
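Newer PyMongo releases no longer provide Collection.group, so on a current installation the same result would have to come from the aggregation framework. The following is a minimal sketch under that assumption: `$group` with `$push` and the `$$ROOT` variable collects whole documents per uid. The function name `group_by_uid` is made up for this example, and `coll` is assumed to be the same collection as in the session above.

```python
# Collect every full document ($$ROOT) into a list, one list per uid,
# and sort the groups by uid so the output order is predictable.
pipeline = [
    {"$group": {"_id": "$uid", "list": {"$push": "$$ROOT"}}},
    {"$sort": {"_id": 1}},
]


def group_by_uid(coll):
    """Hypothetical helper: return a dict mapping each uid to its documents."""
    return {doc["_id"]: doc["list"] for doc in coll.aggregate(pipeline)}
```

The returned dict is keyed by the plain uid value rather than by a `{uid: ...}` document as in the question, since Python dicts cannot use a dict as a key.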