Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MongoDB fetch documents with sort by count

Tags:

mongodb

I have a document with sub-document which looks something like:

{ 
    "name" : "some name1" 
    "like" : [      
            {  "date" : ISODate("2012-11-30T19:00:00Z") },
            {  "date" : ISODate("2012-12-02T19:00:00Z") },     
            {  "date" : ISODate("2012-12-01T19:00:00Z") },
            {  "date" : ISODate("2012-12-03T19:00:00Z") } 
    ]       
}

Is it possible to fetch documents "most liked" (average value for the last 7 days) and sort by the count?

like image 758
Mirodil Avatar asked Dec 24 '12 16:12

Mirodil


2 Answers

There are a few different ways to solve this problem. The solution I will focus on uses mongodb's aggregation framework. First, here is an aggregation pipeline that will solve your problem, following it will be an explanation/breakdown of what is happening in the command.

db.testagg.aggregate( 
    { $unwind : '$likes' }, 
    { $group : {  _id : '$_id', numlikes : { $sum : 1 }}}, 
    { $sort : { 'numlikes' : 1}})

This pipeline has 3 main commands:

1) Unwind: this splits up the 'likes' field so that there is 1 'like' element per document

2) Group: this regroups the document using the _id field, incrementing the numLikes field for every document it finds. This will cause numLikes to be filled with a number equal to the number of elements that were in "likes" before

3) Sort: Finally, we sort the return values in ascending order based on numLikes. In a test I ran the output of this command is:

{"result" : [
    {
        "_id" : 1,
        "numlikes" : 1
    },
    {
        "_id" : 2,
        "numlikes" : 2
    },
    {
        "_id" : 3,
        "numlikes" : 3
    },
    {
        "_id" : 4,
        "numlikes" : 4
    }....

This is for data inserted via:

for (var i=0; i < 100; i++) {
    db.testagg.insert({_id : i})
    for (var j=0; j < i; j++) {
        db.testagg.update({_id : i}, {'$push' : {'likes' : j}})
    }
}

Note that this does not completely answer your question as it avoids the issue of picking the date range, but it should hopefully get you started and moving in the right direction.

Of course, there are other ways to solve this problem. One solution might be to just do all of the sorting and manipulations client-side. This is just one method for getting the information you desire.

EDIT: If you find this somewhat tedious, there is a ticket to add a $size operator to the aggregation framework, I invite you to watch and potentially upvote it to try and speed to addition of this new operator if you are interested.

https://jira.mongodb.org/browse/SERVER-4899

like image 128
ACE Avatar answered Dec 01 '22 17:12

ACE


A better solution would be to keep a count field that will record how many likes for this document. While you can use aggregation to do this, the performance will likely be not very good. Having a index on the count field will make read operation fast, and you can use atomic operation to increment the counter when inserting new likes.

like image 26
ltfishie Avatar answered Dec 01 '22 17:12

ltfishie