
How to use "group" in PyMongo to group similar rows?

I am very new to MongoDB/PyMongo. I have successfully imported my data into MongoDB and would like to use the group function to group similar rows together. For example, if my data set looks like this:

data = [{'uid': 1, 'event': 'a', 'time': 1},
        {'uid': 1, 'event': 'b', 'time': 2},
        {'uid': 2, 'event': 'c', 'time': 2},
        {'uid': 3, 'event': 'd', 'time': 4}
       ]

How do I use the group function to group the above rows according to the uid field such that the output is as follows?

 { {uid: 1} : [{uid: 1 , event: 'a' , time: 1} , {uid: 1 , event: 'b' , time: 2} ],
   {uid: 2} : [{uid: 2 , event: 'c' , time: 2} ],
   {uid: 3} : [{uid: 3 , event: 'd' , time: 4} ] }

I read through the examples at http://www.mongodb.org/display/DOCS/Aggregation. However, it seems to me that those examples always aggregate into a single number or object.

Thanks,

defoo asked Feb 15 '11 23:02



1 Answer

The reduce function doesn't have to actually reduce anything; it can simply append each document to a list. For example:

>>> coll.insert(dict(uid=1,event='a',time=1))
ObjectId('4d5b91d558839f06a8000000')
>>> coll.insert(dict(uid=1,event='b',time=2))
ObjectId('4d5b91e558839f06a8000001')
>>> coll.insert(dict(uid=2,event='c',time=2))
ObjectId('4d5b91f358839f06a8000002')
>>> coll.insert(dict(uid=3,event='d',time=4))
ObjectId('4d5b91fd58839f06a8000003')
>>> result = coll.group(['uid'], None,
                        {'list': []}, # initial
                        'function(obj, prev) {prev.list.push(obj)}') # reducer
>>> len(result) # will show three groups
3
>>> int(result[0]['uid'])
1
>>> result[0]['list']
[{u'event': u'a', u'_id': ObjectId('4d5b...0000'), u'uid': 1, u'time': 1},
 {u'event': u'b', u'_id': ObjectId('4d5b...0001'), u'uid': 1, u'time': 2}]
>>> int(result[1]['uid'])
2
>>> result[1]['list']
[{u'event': u'c', u'_id': ObjectId('4d5b...0002'), u'uid': 2, u'time': 2}]
>>> int(result[2]['uid'])
3
>>> result[2]['list']
[{u'event': u'd', u'_id': ObjectId('4d5b...0003'), u'uid': 3, u'time': 4}]

I've shortened the object IDs in the above listing to improve readability.
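
Note that `Collection.group` was removed in PyMongo 4.0, and the underlying `group` command was removed in MongoDB 4.2, so the listing above only runs on older versions. On current versions, the same grouping can be sketched with the aggregation framework, using `$group` with `$push` and the `$$ROOT` variable to collect whole documents. The helper names `group_by_uid` and `rows_by_uid` below are made up for illustration, and `coll` is assumed to be a PyMongo `Collection` holding the four documents from the question:

```python
# Aggregation pipeline: one output document per distinct uid.
# "$$ROOT" refers to the whole input document, so "list" collects
# every document that shares the same uid.
PIPELINE = [
    {"$group": {"_id": "$uid", "list": {"$push": "$$ROOT"}}},
    {"$sort": {"_id": 1}},  # optional: return the groups ordered by uid
]


def rows_by_uid(groups):
    """Turn the aggregation output into a dict keyed by uid.

    `groups` is any iterable of documents shaped like
    {"_id": <uid>, "list": [<doc>, ...]}, which is what PIPELINE yields.
    """
    return {g["_id"]: g["list"] for g in groups}


def group_by_uid(coll):
    """Run PIPELINE on `coll` (assumed to be a pymongo Collection)."""
    return rows_by_uid(coll.aggregate(PIPELINE))
```

Calling `group_by_uid(coll)` would return a plain dict mapping each uid to its list of documents, which is close to the shape asked for in the question (a dict cannot itself be used as a dict key in Python, so the bare uid value is used instead).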

Vinay Sajip answered Oct 26 '22 11:10