Querying after populate in Mongoose

I'm pretty new to Mongoose and MongoDB in general so I'm having a difficult time figuring out if something like this is possible:

    Item = new Schema({
        id: Schema.ObjectId,
        dateCreated: { type: Date, default: Date.now },
        title: { type: String, default: 'No Title' },
        description: { type: String, default: 'No Description' },
        tags: [ { type: Schema.ObjectId, ref: 'ItemTag' }]
    });

    ItemTag = new Schema({
        id: Schema.ObjectId,
        tagId: { type: Schema.ObjectId, ref: 'Tag' },
        tagName: { type: String }
    });

    var query = Models.Item.find({});

    query
        .desc('dateCreated')
        .populate('tags')
        .where('tags.tagName').in(['funny', 'politics'])
        .run(function(err, docs){
           // docs is always empty
        });

Is there a better way to do this?

Edit

Apologies for any confusion. What I'm trying to do is get all Items that contain either the funny tag or politics tag.

Edit

Document without where clause:

    [{
        _id: 4fe90264e5caa33f04000012,
        dislikes: 0,
        likes: 0,
        source: '/uploads/loldog.jpg',
        comments: [],
        tags: [{
            itemId: 4fe90264e5caa33f04000012,
            tagName: 'movies',
            tagId: 4fe64219007e20e644000007,
            _id: 4fe90270e5caa33f04000015,
            dateCreated: Tue, 26 Jun 2012 00:29:36 GMT,
            rating: 0,
            dislikes: 0,
            likes: 0
        },
        {
            itemId: 4fe90264e5caa33f04000012,
            tagName: 'funny',
            tagId: 4fe64219007e20e644000002,
            _id: 4fe90270e5caa33f04000017,
            dateCreated: Tue, 26 Jun 2012 00:29:36 GMT,
            rating: 0,
            dislikes: 0,
            likes: 0
        }],
        viewCount: 0,
        rating: 0,
        type: 'image',
        description: null,
        title: 'dogggg',
        dateCreated: Tue, 26 Jun 2012 00:29:24 GMT
    }, ... ]

With the where clause, I get an empty array.

asked Jul 03 '12 01:07

jschr

2 Answers

With MongoDB 3.2 or later you can use $lookup as an alternative to .populate() in most cases. This also has the advantage of actually doing the join "on the server", as opposed to .populate(), which actually issues "multiple queries" to "emulate" a join.

So .populate() is not really a "join" in the sense of how a relational database does it. The $lookup operator on the other hand, actually does the work on the server, and is more or less analogous to a "LEFT JOIN":
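
To see why the query in the question comes back empty, it may help to emulate the two round trips with plain arrays. This is only a sketch of the idea, not how Mongoose is implemented: the `items` and `itemTags` arrays and the `findItems` and `emulatePopulate` helpers are hypothetical stand-ins for the collections and the two queries.

```javascript
// Hypothetical in-memory stand-ins for the two collections.
const itemTags = [
  { _id: 't1', tagName: 'movies' },
  { _id: 't2', tagName: 'funny' }
];
const items = [
  { _id: 'i1', title: 'dogggg', tags: ['t1', 't2'] },
  { _id: 'i2', title: 'untagged', tags: [] }
];

// Query 1: the parent query. Conditions here only ever see the stored
// references in "tags", so a condition on "tags.tagName" cannot match.
function findItems(predicate) {
  return items.filter(predicate);
}

// Query 2: fetch the referenced children and substitute them in place.
function emulatePopulate(parents, match = () => true) {
  const ids = new Set(parents.flatMap(p => p.tags));
  const children = itemTags.filter(t => ids.has(t._id) && match(t));
  const byId = new Map(children.map(c => [c._id, c]));
  return parents.map(p => ({
    ...p,
    tags: p.tags.filter(id => byId.has(id)).map(id => byId.get(id))
  }));
}

const wanted = ['funny', 'politics'];

// Filtering the parent query by tag name finds nothing, because at that
// point "tags" holds ids rather than documents.
const wrong = findItems(item => item.tags.some(t => wanted.includes(t.tagName)));

// Filtering during the second query trims the children but keeps every parent.
const populated = emulatePopulate(findItems(() => true),
  t => wanted.includes(t.tagName));

console.log(wrong.length);
console.log(populated.map(p => [p._id, p.tags.map(t => t.tagName)]));
```

Note that the second approach still returns the parent whose tags array ended up empty, which is why a condition applied during population cannot by itself restrict which Items come back.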

    Item.aggregate(
      [
        { "$lookup": {
          "from": ItemTag.collection.name,
          "localField": "tags",
          "foreignField": "_id",
          "as": "tags"
        }},
        { "$unwind": "$tags" },
        { "$match": { "tags.tagName": { "$in": [ "funny", "politics" ] } } },
        { "$group": {
          "_id": "$_id",
          "dateCreated": { "$first": "$dateCreated" },
          "title": { "$first": "$title" },
          "description": { "$first": "$description" },
          "tags": { "$push": "$tags" }
        }}
      ],
      function(err, result) {
        // "tags" is now filtered by condition and "joined"
      }
    )

N.B. The .collection.name here actually evaluates to the "string" that is the actual name of the MongoDB collection as assigned to the model. Since mongoose "pluralizes" collection names by default and $lookup needs the actual MongoDB collection name as an argument ( since it's a server operation ), then this is a handy trick to use in mongoose code, as opposed to "hard coding" the collection name directly.

Whilst we could also use $filter on arrays to remove the unwanted items, this is actually the most efficient form due to Aggregation Pipeline Optimization for the special condition of a $lookup followed by both an $unwind and a $match condition.
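
For comparison, the $filter alternative might be sketched as below. The helper name `buildFilterPipeline` is hypothetical, and the sketch assumes MongoDB 3.4 or later, since both the $addFields stage and the aggregation form of $in were added in that release. The final $match drops Items left with no tags.

```javascript
// Hypothetical helper: builds a $lookup + $filter pipeline for the given
// foreign collection name and list of wanted tag names.
function buildFilterPipeline(from, tagNames) {
  return [
    // Join the referenced ItemTag documents into "tags".
    { $lookup: { from, localField: 'tags', foreignField: '_id', as: 'tags' } },
    // Keep only the joined tags whose tagName is in the wanted list.
    { $addFields: {
      tags: {
        $filter: {
          input: '$tags',
          as: 'tag',
          cond: { $in: ['$$tag.tagName', tagNames] }
        }
      }
    }},
    // Discard Items whose filtered "tags" array is empty.
    { $match: { 'tags.0': { $exists: true } } }
  ];
}

const filterPipeline = buildFilterPipeline('itemtags', ['funny', 'politics']);
console.log(JSON.stringify(filterPipeline, null, 2));
```

In Mongoose code this could be passed as `Item.aggregate(buildFilterPipeline(ItemTag.collection.name, ['funny', 'politics']))`. Unlike the $unwind form, the whole joined array is materialised before it is filtered.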

This actually results in the three pipeline stages being rolled into one:

    { "$lookup" : {
      "from" : "itemtags",
      "as" : "tags",
      "localField" : "tags",
      "foreignField" : "_id",
      "unwinding" : {
        "preserveNullAndEmptyArrays" : false
      },
      "matching" : {
        "tagName" : {
          "$in" : [
            "funny",
            "politics"
          ]
        }
      }
    }}

This is highly optimal as the actual operation "filters the collection to join first", then it returns the results and "unwinds" the array. Both methods are employed so the results do not break the BSON limit of 16MB, which is a constraint that the client does not have.

The only problem is that it seems "counter-intuitive" in some ways, particularly when you want the results in an array, but that is what the $group is for here, as it reconstructs to the original document form.
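
What each stage does to the documents can be emulated on plain objects. This is a rough sketch only: the sample data is made up, and the array operations merely mimic the effect of $unwind, $match and $group rather than how the server executes them.

```javascript
// Items as they look after $lookup: "tags" holds the joined ItemTag documents.
const joined = [
  { _id: 'i1', title: 'dogggg',
    tags: [{ _id: 't1', tagName: 'movies' }, { _id: 't2', tagName: 'funny' }] },
  { _id: 'i2', title: 'cat', tags: [{ _id: 't1', tagName: 'movies' }] }
];

// $unwind: one output document per array element.
const unwound = joined.flatMap(doc =>
  doc.tags.map(tag => ({ ...doc, tags: tag })));

// $match: keep only rows whose single tag satisfies the condition.
const wantedNames = ['funny', 'politics'];
const matched = unwound.filter(doc => wantedNames.includes(doc.tags.tagName));

// $group: collapse rows back to one document per _id, pushing the tags
// into an array again and taking the first value of every other field.
const grouped = new Map();
for (const { tags, ...rest } of matched) {
  if (!grouped.has(rest._id)) grouped.set(rest._id, { ...rest, tags: [] });
  grouped.get(rest._id).tags.push(tags);
}
const regrouped = [...grouped.values()];

console.log(JSON.stringify(regrouped, null, 2));
```

The Item with only the movies tag disappears entirely, and the remaining Item comes back in its original shape with just the matching tag.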

It's also unfortunate that we simply cannot at this time actually write $lookup in the same eventual syntax the server uses. IMHO, this is an oversight to be corrected. But for now, simply using the sequence will work and is the most viable option with the best performance and scalability.

Addendum - MongoDB 3.6 and upwards

Though the pattern shown here is fairly optimized due to how the other stages get rolled into the $lookup, it does have one failing in that the "LEFT JOIN" which is normally inherent to both $lookup and the actions of populate() is negated by the "optimal" usage of $unwind here which does not preserve empty arrays. You can add the preserveNullAndEmptyArrays option, but this negates the "optimized" sequence described above and essentially leaves all three stages intact which would normally be combined in the optimization.

MongoDB 3.6 expands on this with a "more expressive" form of $lookup that allows a "sub-pipeline" expression. This not only meets the goal of retaining the "LEFT JOIN" but still allows an optimal query that reduces the results returned, with a much simplified syntax:

    Item.aggregate([
      { "$lookup": {
        "from": ItemTag.collection.name,
        "as": "tags",
        "let": { "tags": "$tags" },
        "pipeline": [
          { "$match": {
            "tagName": { "$in": [ "politics", "funny" ] },
            "$expr": { "$in": [ "$_id", "$$tags" ] }
          }}
        ]
      }}
    ])

The $expr used in order to match the declared "local" value with the "foreign" value is actually what MongoDB does "internally" now with the original $lookup syntax. By expressing in this form we can tailor the initial $match expression within the "sub-pipeline" ourselves.

In fact, as a true "aggregation pipeline" you can do just about anything you can do with an aggregation pipeline within this "sub-pipeline" expression, including "nesting" the levels of $lookup to other related collections.

Further usage is a bit beyond the scope of what the question asks, but even for "nested population" the new usage pattern of $lookup works in much the same way, and is a "lot" more powerful in its full usage.

Working Example

The following gives an example using a static method on the model. Once that static method is implemented the call simply becomes:

    Item.lookup(
      {
        path: 'tags',
        query: { 'tags.tagName' : { '$in': [ 'funny', 'politics' ] } }
      },
      callback
    )

Or enhancing to be a bit more modern even becomes:

    let results = await Item.lookup({
      path: 'tags',
      query: { 'tagName' : { '$in': [ 'funny', 'politics' ] } }
    })

This makes it very similar to .populate() in structure, but it's actually doing the join on the server instead. For completeness, the usage here casts the returned data back to mongoose document instances for both the parent and child cases.

It's fairly trivial and easy to adapt or just use as is for most common cases.

N.B. The use of the async library here is just for brevity in running the enclosed example. The actual implementation is free of this dependency.

    const async = require('async'),
          mongoose = require('mongoose'),
          Schema = mongoose.Schema;

    mongoose.Promise = global.Promise;
    mongoose.set('debug', true);
    mongoose.connect('mongodb://localhost/looktest');

    const itemTagSchema = new Schema({
      tagName: String
    });

    const itemSchema = new Schema({
      dateCreated: { type: Date, default: Date.now },
      title: String,
      description: String,
      tags: [{ type: Schema.Types.ObjectId, ref: 'ItemTag' }]
    });

    itemSchema.statics.lookup = function(opt,callback) {
      // Resolve the related model from the "ref" on the given path
      let rel =
        mongoose.model(this.schema.path(opt.path).caster.options.ref);

      // Build a $group that restores the original document shape
      let group = { "$group": { } };
      this.schema.eachPath(p =>
        group.$group[p] = (p === "_id") ? "$_id" :
          (p === opt.path) ? { "$push": `$${p}` } : { "$first": `$${p}` });

      let pipeline = [
        { "$lookup": {
          "from": rel.collection.name,
          "as": opt.path,
          "localField": opt.path,
          "foreignField": "_id"
        }},
        { "$unwind": `$${opt.path}` },
        { "$match": opt.query },
        group
      ];

      this.aggregate(pipeline,(err,result) => {
        // Return early so the callback is not invoked a second time
        if (err) return callback(err);
        // Cast the plain results back to mongoose documents
        result = result.map(m => {
          m[opt.path] = m[opt.path].map(r => rel(r));
          return this(m);
        });
        callback(err,result);
      });
    }

    const Item = mongoose.model('Item', itemSchema);
    const ItemTag = mongoose.model('ItemTag', itemTagSchema);

    function log(body) {
      console.log(JSON.stringify(body, undefined, 2))
    }

    async.series(
      [
        // Clean data
        (callback) => async.each(mongoose.models,(model,callback) =>
          model.remove({},callback),callback),

        // Create tags and items
        (callback) =>
          async.waterfall(
            [
              (callback) =>
                ItemTag.create([{ "tagName": "movies" }, { "tagName": "funny" }],
                  callback),

              (tags, callback) =>
                Item.create({ "title": "Something","description": "An item",
                  "tags": tags },callback)
            ],
            callback
          ),

        // Query with our static
        (callback) =>
          Item.lookup(
            {
              path: 'tags',
              query: { 'tags.tagName' : { '$in': [ 'funny', 'politics' ] } }
            },
            callback
          )
      ],
      (err,results) => {
        if (err) throw err;
        let result = results.pop();
        log(result);
        mongoose.disconnect();
      }
    )

Or a little more modern for Node 8.x and above with async/await and no additional dependencies:

    const { Schema } = mongoose = require('mongoose');
    const uri = 'mongodb://localhost/looktest';

    mongoose.Promise = global.Promise;
    mongoose.set('debug', true);

    const itemTagSchema = new Schema({
      tagName: String
    });

    const itemSchema = new Schema({
      dateCreated: { type: Date, default: Date.now },
      title: String,
      description: String,
      tags: [{ type: Schema.Types.ObjectId, ref: 'ItemTag' }]
    });

    itemSchema.statics.lookup = function(opt) {
      let rel =
        mongoose.model(this.schema.path(opt.path).caster.options.ref);

      let group = { "$group": { } };
      this.schema.eachPath(p =>
        group.$group[p] = (p === "_id") ? "$_id" :
          (p === opt.path) ? { "$push": `$${p}` } : { "$first": `$${p}` });

      let pipeline = [
        { "$lookup": {
          "from": rel.collection.name,
          "as": opt.path,
          "localField": opt.path,
          "foreignField": "_id"
        }},
        { "$unwind": `$${opt.path}` },
        { "$match": opt.query },
        group
      ];

      return this.aggregate(pipeline).exec().then(r => r.map(m =>
        this({ ...m, [opt.path]: m[opt.path].map(r => rel(r)) })
      ));
    }

    const Item = mongoose.model('Item', itemSchema);
    const ItemTag = mongoose.model('ItemTag', itemTagSchema);

    const log = body => console.log(JSON.stringify(body, undefined, 2));

    (async function() {
      try {

        const conn = await mongoose.connect(uri);

        // Clean data
        await Promise.all(Object.entries(conn.models).map(([k,m]) => m.remove()));

        // Create tags and items
        const tags = await ItemTag.create(
          ["movies", "funny"].map(tagName =>({ tagName }))
        );
        const item = await Item.create({
          "title": "Something",
          "description": "An item",
          tags
        });

        // Query with our static
        const result = (await Item.lookup({
          path: 'tags',
          query: { 'tags.tagName' : { '$in': [ 'funny', 'politics' ] } }
        })).pop();
        log(result);

        mongoose.disconnect();

      } catch (e) {
        console.error(e);
      } finally {
        process.exit()
      }
    })()

And from MongoDB 3.6 and upward, even without the $unwind and $group building:

    const { Schema, Types: { ObjectId } } = mongoose = require('mongoose');

    const uri = 'mongodb://localhost/looktest';

    mongoose.Promise = global.Promise;
    mongoose.set('debug', true);

    const itemTagSchema = new Schema({
      tagName: String
    });

    const itemSchema = new Schema({
      title: String,
      description: String,
      tags: [{ type: Schema.Types.ObjectId, ref: 'ItemTag' }]
    },{ timestamps: true });

    itemSchema.statics.lookup = function({ path, query }) {
      let rel =
        mongoose.model(this.schema.path(path).caster.options.ref);

      // MongoDB 3.6 and up $lookup with sub-pipeline
      let pipeline = [
        { "$lookup": {
          "from": rel.collection.name,
          "as": path,
          "let": { [path]: `$${path}` },
          "pipeline": [
            { "$match": {
              ...query,
              "$expr": { "$in": [ "$_id", `$$${path}` ] }
            }}
          ]
        }}
      ];

      return this.aggregate(pipeline).exec().then(r => r.map(m =>
        this({ ...m, [path]: m[path].map(r => rel(r)) })
      ));
    };

    const Item = mongoose.model('Item', itemSchema);
    const ItemTag = mongoose.model('ItemTag', itemTagSchema);

    const log = body => console.log(JSON.stringify(body, undefined, 2));

    (async function() {

      try {

        const conn = await mongoose.connect(uri);

        // Clean data
        await Promise.all(Object.entries(conn.models).map(([k,m]) => m.remove()));

        // Create tags and items
        const tags = await ItemTag.insertMany(
          ["movies", "funny"].map(tagName => ({ tagName }))
        );

        const item = await Item.create({
          "title": "Something",
          "description": "An item",
          tags
        });

        // Query with our static
        let result = (await Item.lookup({
          path: 'tags',
          query: { 'tagName': { '$in': [ 'funny', 'politics' ] } }
        })).pop();
        log(result);

        await mongoose.disconnect();

      } catch(e) {
        console.error(e)
      } finally {
        process.exit()
      }

    })()

answered Oct 13 '22 23:10

Neil Lunn

what you are asking for isn't directly supported but can be achieved by adding another filter step after the query returns.

first, .populate( 'tags', null, { tagName: { $in: ['funny', 'politics'] } } ) is definitely what you need to do to filter the tags documents. then, after the query returns you'll need to manually filter out documents that don't have any tags docs that matched the populate criteria. something like:

    query....
    .exec(function(err, docs){
       docs = docs.filter(function(doc){
         return doc.tags.length;
       })
       // do stuff with docs
    });
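
Put together with promises, the same approach might look like the sketch below. The function names `keepItemsWithTags` and `findItemsByTagNames` are hypothetical, and the sketch assumes `Item` is a Mongoose model whose "tags" path references ItemTag and that the installed Mongoose version accepts the object form of populate with a `match` option.

```javascript
// Pure helper: drop parents whose populated "tags" array ended up empty.
function keepItemsWithTags(docs) {
  return docs.filter(doc => Array.isArray(doc.tags) && doc.tags.length > 0);
}

// Assumption: `Item` is a Mongoose model with tags: [{ ref: 'ItemTag' }].
async function findItemsByTagNames(Item, tagNames) {
  const docs = await Item.find({})
    .sort({ dateCreated: -1 })
    // The match condition is applied to the ItemTag query, not to Item.
    .populate({ path: 'tags', match: { tagName: { $in: tagNames } } })
    .exec();
  // Every Item is still returned, so remove those with no matching tags.
  return keepItemsWithTags(docs);
}

// The filtering step on its own, with made-up documents:
const sampleDocs = [
  { title: 'dogggg', tags: [{ tagName: 'funny' }] },
  { title: 'no match', tags: [] }
];
console.log(keepItemsWithTags(sampleDocs).map(d => d.title));
```

Because the filtering happens on the client after every Item has been fetched, this suits small collections better than large ones.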

answered Oct 13 '22 22:10

aaronheckmann


Tags:

node.js

mongodb

mongoose