Can I use populate before aggregate in mongoose?

Q: Which aggregate method is preferred for use by MongoDB?

The pipeline provides efficient data aggregation using native operations within MongoDB, and is the preferred method for data aggregation in MongoDB. The aggregation pipeline can operate on a sharded collection. The aggregation pipeline can use indexes to improve its performance during some of its stages.

Tags:

mongodb

mongodb-query

mongoose

aggregation-framework

I have two models, one is user

 userSchema = new Schema({
     userID: String,
     age: Number
 });

and the other is the score recorded several times everyday for all users

 ScoreSchema = new Schema({
     userID: {type: String, ref: 'User'},
     score: Number,
     created_date = Date,
     ....
 })

I would like to do some query/calculation on the score for some users meeting specific requirement, say I would like to calculate the average of score for all users greater than 20 day by day.

My thought is that firstly do the populate on Scores to populate user's ages and then do the aggregate after that.

Something like

Score.
    populate('userID','age').
    aggregate([
        {$match: {'userID.age': {$gt: 20}}},
        {$group: ...},
        {$group: ...}
    ], function(err, data){});

Is it Ok to use populate before aggregate? Or I first find all the userID meeting the requirement and save them in a array and then use $in to match the score document?

733

asked Jan 31 '16 21:01

SilentCanon

1 Answers

No you cannot call .populate() before .aggregate(), and there is a very good reason why you cannot. But there are different approaches you can take.

The .populate() method works "client side" where the underlying code actually performs additional queries ( or more accurately an $in query ) to "lookup" the specified element(s) from the referenced collection.

In contrast .aggregate() is a "server side" operation, so you basically cannot manipulate content "client side", and then have that data available to the aggregation pipeline stages later. It all needs to be present in the collection you are operating on.

A better approach here is available with MongoDB 3.2 and later, via the $lookup aggregation pipeline operation. Also probably best to handle from the User collection in this case in order to narrow down the selection:

User.aggregate(
    [
        // Filter first
        { "$match": {
            "age": { "$gt": 20 } 
        }},
        // Then join
        { "$lookup": {
            "from": "scores",
            "localField": "userID",
            "foriegnField": "userID",
            "as": "score"
        }},
        // More stages
    ],
    function(err,results) {

    }
)

This is basically going to include a new field "score" within the User object as an "array" of items that matched on "lookup" to the other collection:

{
    "userID": "abc",
    "age": 21,
    "score": [{
        "userID": "abc",
        "score": 42,
        // other fields
    }]
}

The result is always an array, as the general expected usage is a "left join" of a possible "one to many" relationship. If no result is matched then it is just an empty array.

To use the content, just work with an array in any way. For instance, you can use the $arrayElemAt operator in order to just get the single first element of the array in any future operations. And then you can just use the content like any normal embedded field:

        { "$project": {
            "userID": 1,
            "age": 1,
            "score": { "$arrayElemAt": [ "$score", 0 ] }
        }}

If you don't have MongoDB 3.2 available, then your other option to process a query limited by the relations of another collection is to first get the results from that collection and then use $in to filter on the second:

// Match the user collection
User.find({ "age": { "$gt": 20 } },function(err,users) {

    // Get id list      
    userList = users.map(function(user) {
       return user.userID;
    });

    Score.aggregate(
        [ 
            // use the id list to select items
            { "$match": {
                "userId": { "$in": userList }
            }},
            // more stages
        ],
        function(err,results) {

        }
    );

});

So by getting the list of valid users from the other collection to the client and then feeding that to the other collection in a query is the onyl way to get this to happen in earlier releases.

133

answered Sep 28 '22 12:09

Blakes Seven

Related questions
                            
                                Mongo: how to sort by external weight
                            
                                MongoDB: Chain find()
                            
                                How to pull all elements from array in MongoDB without any condition
                            
                                Search Date Range in Meteor
                            
                                allowDiskUse in Aggregation Framework with MongoDB C# Driver
                            
                                Mongoose create connection for multi-tenancy support in node.js
                            
                                MongoDB check if a property exists (and a child property)
                            
                                Check whether insertions were successful (MongoDB C# driver)
                            
                                MongoDB on Wamp x64 (win x64) = mongo.native_long error
                            
                                MongoDB: Sort by array index
                            
                                How to push multiple columns' value within group
                            
                                MongoError: driver is incompatible with this server version
                            
                                How to use mongoimport for specific fileds from tsv file?
                            
                                Mongo db shell variable printing only once
                            
                                folder structure for MEAN stack
                            
                                MongoDb regional replica set - primary node in each region?
                            
                                mongoskin and mongoose dependency error
                            
                                Spring Data MongoDB failed with "in" query
                            
                                Mongodb in Docker: numactl --interleave=all explanation
                            
                                How to filter/find records in MongoDB based on Date

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With