Get index of an item within mongodb query

I have a query which looks like:

function getPage(page) {
  return db.messages.aggregate(
    {
      '$group': {
        _id: "$subjectID"
      }
    },
    { '$skip': page * 20 },
    { '$limit' : 20 });
}

Say I have a subjectID that I know appears somewhere in that collection. What I want to do is write something like:

function pageOf(subjectID) {
  return Math.floor(db.messages.aggregate(
    {
      '$group': {
        _id: "$subjectID"
      }
    }).indexOf({_id: subjectID}) / 20);
}

Except I have no idea how to write the indexOf part of that query. I was wondering if mongodb might have some sort of "take while" or "take until" query, then you could do that followed by a count of the number of items.

asked Dec 06 '25 by ForbesLindesay

1 Answer

Ordering by subjectID

If your subjectID is (or can be changed to) a monotonically increasing value (for example, the default MongoDB ObjectId), you have a straightforward option using a normal find() with appropriate sort, skip, and limit. In this case you can look for documents with subjectIDs $gte (greater than or equal to) your starting subjectID:

var page = 1;
var subjectID = ObjectId("515535a0760fe8735f5f6897");
db.messages.find(
    { subjectID: { $gte: subjectID } }
).sort({ subjectID: 1 }).skip(page * 20).limit(20)
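Under this ordering, the page containing a known subjectID can also be derived without any indexOf: its zero-based position is simply the number of IDs that sort strictly below it. A minimal sketch in plain JavaScript (an in-memory array simulates the sorted IDs; with real data the count could come from something like db.messages.count({ subjectID: { $lt: subjectID } }), and pageFromCountBelow is an illustrative helper, not a MongoDB API):

```javascript
var PAGE_SIZE = 20;

// A document's zero-based position in sorted order equals the number of
// IDs strictly below it, so its page is floor(count / PAGE_SIZE).
function pageFromCountBelow(countBelow) {
  return Math.floor(countBelow / PAGE_SIZE);
}

// Simulated sorted subjectIDs 0..99.
var ids = [];
for (var i = 0; i < 100; i++) ids.push(i);

// Count how many IDs sort strictly below the target.
function countBelow(sortedIDs, subjectID) {
  return sortedIDs.filter(function (id) { return id < subjectID; }).length;
}

console.log(pageFromCountBelow(countBelow(ids, 45))); // → 2
```

The same idea works for any sortable ID type, since only the relative order matters, not the values themselves.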

Aggregation Framework

As of MongoDB 2.4, there is no feature in the Aggregation Framework to match based on a document's position in the result pipeline. You could file a feature suggestion in the MongoDB Jira project's SERVER queue.

It sounds like you want a new pipeline operator such as $matchfrom, which would ignore any documents until the first occurrence of the $matchfrom criteria. You could then add a $limit to take the next n items. You would also want sorted output before the $matchfrom so the outcome is predictable.

This seems overcomplicated compared to having an increasing subjectID, but there may be a use case for doing paging based on more advanced search criteria or results calculated in the aggregation pipeline.

Alternative approaches

Aside from future support for such a feature in the Aggregation Framework, you have a few options to implement the same matching approach in code:

  • use the older group() aggregation command with a finalize() function. NOTE: group() does not work with sharded clusters.

  • use MapReduce and a finalize() function

  • fetch the whole result set from the Aggregation Framework, and implement the matching/reduction of results in your application code (though this somewhat defeats the "paging" notion if you are fetching all pages for every request).
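The last option can be sketched in plain JavaScript: fetch the grouped results, scan for the target subjectID, and convert its index to a page number (an in-memory array stands in for the aggregation output; indexOfSubject and pageOf are illustrative helpers, not MongoDB APIs):

```javascript
var PAGE_SIZE = 20;

// results stands in for the output of the $group stage:
// one { _id: subjectID } document per distinct subjectID.
function indexOfSubject(results, subjectID) {
  for (var i = 0; i < results.length; i++) {
    if (results[i]._id === subjectID) return i;
  }
  return -1; // not found
}

function pageOf(results, subjectID) {
  var index = indexOfSubject(results, subjectID);
  return index === -1 ? -1 : Math.floor(index / PAGE_SIZE);
}

// Simulated aggregation output for subjectIDs "s0".."s49".
var results = [];
for (var i = 0; i < 50; i++) results.push({ _id: "s" + i });

console.log(pageOf(results, "s25")); // → 1
```

Note that this transfers every grouped document to the client on each lookup, which is exactly the cost the answer warns about.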

Performance considerations

Queries with skip still have to read through the intervening index entries, so skipping a large number of documents will not be very efficient.

Instead of paging with a skip offset, you could consider doing successive page queries by starting from the last entry of the previous page (i.e. the first page would be $gte the starting subjectID and subsequent pages would be $gt the last subjectID included on the previous page). This will depend on how you present the paging in your user interface - it would be easiest to use this approach if your UI only has the option to show "next" page of messages rather than jumping to a specific page.
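The "next page" approach above can be sketched as follows (a sorted in-memory array stands in for the sorted, indexed query; nextPage is an illustrative helper mimicking db.messages.find({ subjectID: { $gt: lastSeen } }).sort({ subjectID: 1 }).limit(20)):

```javascript
var PAGE_SIZE = 20;

// Return the next page of IDs strictly greater than lastSeen;
// pass null to get the first page.
function nextPage(sortedIDs, lastSeen) {
  var start = 0;
  if (lastSeen !== null) {
    // Skip everything up to and including the last ID already shown.
    while (start < sortedIDs.length && sortedIDs[start] <= lastSeen) start++;
  }
  return sortedIDs.slice(start, start + PAGE_SIZE);
}

var ids = [];
for (var i = 0; i < 50; i++) ids.push(i);

var page1 = nextPage(ids, null);                    // IDs 0..19
var page2 = nextPage(ids, page1[page1.length - 1]); // IDs 20..39
```

Because each query seeks directly to the last-seen ID via the index instead of skipping over intervening entries, the cost per page stays constant regardless of how deep into the collection you are.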

answered Dec 08 '25 by Stennie


