Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Mongodb: Select the top N rows from each group [duplicate]

I use mongodb for my blog platform, where users can create their own blogs. All entries from all blogs are in an entries collection. The document of an entry looks like:

{
  'blog_id':xxx,
  'timestamp':xxx,
  'title':xxx,
  'content':xxx
}

As the question says, is there any way to select, say, last 3 entries for each blog?

like image 433
Tacaza Avatar asked Jun 27 '11 20:06

Tacaza


2 Answers

You need to first sort the documents in the collection by the blog_id and timestamp fields, then do an initial group which creates an array of the original documents in descending order. After that you can slice the array with the documents to return the first 3 elements.

The intuition can be followed in this example:

db.entries.aggregate([
    { '$sort': { 'blog_id': 1, 'timestamp': -1 } }, 
    {       
        '$group': {
            '_id': '$blog_id',
            'docs': { '$push': '$$ROOT' },
        }
    },
    {
        '$project': {
            'top_three': { 
                '$slice': ['$docs', 3]
            }
        }
    }
])
like image 114
chridam Avatar answered Oct 24 '22 09:10

chridam


The only way to do this in basic mongo if you can live with two things :

  • An additional field in your entry document, let's call it "age"
  • A new blog entry taking an additional update

If so, here's how you do it :

  1. Upon creating a new intro do your normal insert and then execute this update to increase the age of all posts (including the one you just inserted for this blog) :

    db.entries.update({blog_id: BLOG_ID}, {age:{$inc:1}}, false, true)

  2. When querying, use the following query which will return the most recent 3 entries for each blog :

    db.entries.find({age:{$lte:3}, timestamp:{$gte:STARTOFMONTH, $lt:ENDOFMONTH}}).sort({blog_id:1, age:1})

Note that this solution is actually concurrency safe (no entries with duplicate ages).

like image 29
Remon van Vliet Avatar answered Oct 24 '22 09:10

Remon van Vliet