I am using Mongoose in my Express / React web application and I'm storing data in a MongoDB database.
I store 'songs' in a songs collection, and each user has an array containing the ids of the songs they have listened to, for example.
Then, to render what they have been listening to, I have to match the ids in that array against the ids in the songs collection.
I am currently using
song.find({_id: {$in: ids}}).exec(callback)
to fetch all the songs matching the ids in the 'ids' array. The 'ids' array may contain the same id several times if the user listened to the song again and again.
The thing is that Mongoose returns the song corresponding to each id only once, so the song is not displayed multiple times. Is there a way I can tell Mongoose to pass the callback as many objects as the number of times the id is repeated?
To sum up:
ids: ['a', 'a', 'a', 'b', 'c']
song.find({_id: {$in: ids}}).exec(callback)
dataPassedToCallback: [songA, songB, songC]
Expecting
dataPassedToCallback: [songA, songA, songA, songB, songC]
There seem to be a couple of possible cases here about what you could be asking.
From the perspective of $in, MongoDB really looks at this as "shorthand" for an $or condition, so effectively these two statements are the same:
"field": { "$in": ["a", "a", "a", "b", "c"] }
and
"$or": [
{ "field": "a" },
{ "field": "a" },
{ "field": "a" },
{ "field": "b" },
{ "field": "c" }
]
At least that is true in terms of the "documents they select", which are merely the "individual" documents the database actually contains. The $in is actually a little more optimal here because the query engine can see that the OR is on the "same key", and this saves some cost in the query plan execution.
Also, just to note, the actual "query plan execution", which can be viewed with explain(), will actually show that the "duplicate" entries are removed anyway:
"filter" : {
"_id" : {
"$in" : [
"a",
"b",
"c"
]
}
},
Notably though, the $or would not actually remove the duplicated conditions, which is really just another point of why $in is more efficient as a query here; and even so, the $or is still not going to match the same document more than once.
But from the perspective of "selection", asking for the same criteria "multiple times" does not result in retrieving "multiple times". The same is true of the order of arguments: they have no effect on the order in which results are returned from the database. Nor would it really make any sense to retrieve "multiple copies" from a "database" perspective, since that is basically redundant.
Instead what you are really asking is "I have a list, now I want to substitute those values with documents from the database". That is actually a reasonable ask, and relatively easy to achieve. Your actual implementation really just depends on where you get the data from.
In the case where you have a "list" from an external source and want the database objects, the logical thing to do is return the matching documents and then substitute them into your ordered list.
In modern NodeJS environments this is as simple as:
let list = ["a", "a", "a", "b", "c"];
let songs = await Song.find({ "_id": { "$in": list } });
songs = list.map(e => songs.find(s => s._id === e));
And now the songs list has an entry for each item in list, in the same order, but with the real database document as returned.
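As a minimal self-contained sketch of that re-mapping, with plain objects standing in for a query result (the names here are illustrative only):

```javascript
// Plain objects stand in for the documents a find() would return.
const list = ["a", "a", "a", "b", "c"];
const songs = [
  { _id: "a", name: "Song A" },
  { _id: "b", name: "Song B" },
  { _id: "c", name: "Song C" }
];

// Substitute each id in the ordered list with its matching document.
const ordered = list.map(e => songs.find(s => s._id === e));

console.log(ordered.map(s => s.name));
// [ 'Song A', 'Song A', 'Song A', 'Song B', 'Song C' ]
```

The duplicate 'a' entries each pick up the same document object, so the list keeps both its length and its order.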
If you are dealing with actual ObjectId values within _id, then it's better to "cast" the values in the list and use the ObjectId.equals() function to compare the "objects":
// of course not "valid" ObjectId here; but
let list = ["a", "a", "a", "b", "c"].map(e => ObjectId(e)); // casting
let songs = await Song.find({ "_id": { "$in": list } });
songs = list.map(e => songs.find(s => s._id.equals(e))); // compare
Without the async/await keywords, enabled by default from the NodeJS 8.x releases or enabled explicitly in earlier versions, standard Promise resolution will do:
// of course not "valid" ObjectId here; but
let list = ["a", "a", "a", "b", "c"].map(e => ObjectId(e)); // casting
Song.find({ "_id": { "$in": list } }).then(songs =>
list.map(e => songs.find(s => s._id.equals(e))) // compare
).then(songs => {
// do something
})
Or with a callback:
let list = ["a", "a", "a", "b", "c"].map(e => ObjectId(e)); // casting
Song.find({ "_id": { "$in": list } }, (err, songs) => {
  songs = list.map(e => songs.find(s => s._id.equals(e))); // compare
});
Note that this is significantly different to "mapping the function", as was mentioned in a comment on the question. There really is no point in "asking the database multiple times" when you already have the results returned from "one" request. So doing something like:
let songs = await Promise.all(list.map(_id => Song.findById(_id)));
is quite horribly redundant, creating additional requests and overhead just for the sake of doing requests. So you would not do that; instead, do the "one" request and "re-map" onto the list, as that simply makes the most sense.
More to the point of your actual implementation, though, this "re-mapping" really has no place at this level of the API. What should "ideally" be happening is that your "front end" makes the request with the "unique" _id list only. Then the request is passed through, allowing the database to respond and simply return the matching documents. As a workflow:
Front End                 Back End                   Front End
---------                 --------                   ---------
List -> Unique List -> Endpoint => Database => Endpoint -> Doc List -> Remap List
So really, from the server "Endpoint" and "Database" perspective, the "documents" as returned should be all they handle. This decreases the payload of network traffic in the request by removing all duplicates. Only when processing at the "Front End", upon receiving the response of those "three" documents in the sample, would you actually "re-map" to the final list containing the duplicate copies.
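A rough sketch of the "Front End" side of that workflow, assuming plain string ids (the docs array merely simulates an endpoint response; no real endpoint is involved):

```javascript
// Deduplicate before the request so the endpoint only ever sees unique ids.
const list = ["a", "a", "a", "b", "c"];
const unique = [...new Set(list)];

// Simulated endpoint response: one document per unique id.
const docs = unique.map(_id => ({ _id, name: _id.toUpperCase() }));

// Re-map the unique response back onto the full ordered list.
const remapped = list.map(e => docs.find(d => d._id === e));

console.log(unique.length);   // 3 - what crossed the network
console.log(remapped.length); // 5 - what the UI renders
```

Only three documents cross the wire, yet the rendered list still has all five entries in the original order.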
On the other hand, if you are actually using data already contained in a document, then Mongoose already supports this where your "list" is already an array within a document. For example, as a document for a SongList model:
{
"list": ["a", "a", "a", "b", "c"]
}
Calling populate where that "list" is actually a list of references to the Song model items will return each "copy", in the order the list is stored in the document:
SongList.find().populate('list')
The reason for this is that .populate() basically issues that same $in query anyway, using the arguments found in the "list" field of the document. Those query results are then "mapped" onto that array using what is essentially exactly the same code as demonstrated above.
So if that is your actual use case, this is already "built in" and there is no need to go and do the query yourself.
The following shows an example listing that adds "three" songs and uses the same "mapping" techniques, as well as showing what populate() just does automatically:
const { Schema, Types: { ObjectId } } = mongoose = require('mongoose');
const { uniq } = require('lodash');

const uri = 'mongodb://localhost/songs';

mongoose.set('debug', true);
mongoose.Promise = global.Promise;

const songSchema = new Schema({
  name: String
});

const songListSchema = new Schema({
  list: [{ type: Schema.Types.ObjectId, ref: 'Song' }]
});

const Song = mongoose.model('Song', songSchema);
const SongList = mongoose.model('SongList', songListSchema);

const log = data => console.log(JSON.stringify(data, undefined, 2));

(async function() {
  try {
    const conn = await mongoose.connect(uri);
    const db = conn.connections[0].db;

    let { version } = await db.command({ "buildInfo": 1 });
    version = parseFloat(version.match(new RegExp(/(?:(?!-).)*/))[0]);

    // Clean out the collections for a repeatable run
    await Promise.all(Object.entries(conn.models).map(([k, m]) => m.remove()));

    let [a, b, c] = await Song.insertMany(['a', 'b', 'c'].map(name => ({ name })));
    await SongList.create({ list: [a, a, b, a, c] });

    // populate is basically mapping the list
    let popresult = await SongList.find().populate('list');
    log({ popresult });

    // Using an id list
    let list = [a, a, b, a, c].map(e => e._id);

    // Use a unique copy for the $in to save bandwidth
    let unique = uniq(list);

    // Map the result
    let songs = await Song.find({ _id: { $in: unique } });
    songs = list.map(e => songs.find(s => s._id.equals(e)));
    log({ songs });

    if (version >= 3.4) {
      // Force the server to return copies
      let stupid = await Song.aggregate([
        { "$match": { "_id": { "$in": unique } } },
        { "$addFields": {
          "copies": {
            "$filter": {
              "input": {
                "$map": {
                  "input": {
                    "$zip": {
                      "inputs": [
                        { "$literal": list },
                        { "$range": [0, { "$size": { "$literal": list } }] }
                      ]
                    }
                  },
                  "in": {
                    "_id": { "$arrayElemAt": ["$$this", 0] },
                    "idx": { "$arrayElemAt": ["$$this", 1] }
                  }
                }
              },
              "cond": { "$eq": ["$$this._id", "$_id"] }
            }
          }
        }},
        { "$unwind": "$copies" },
        { "$sort": { "copies.idx": 1 } },
        { "$project": { "copies": 0 } }
      ]);
      log({ stupid });
    }

  } catch (e) {
    console.error(e);
  } finally {
    process.exit();
  }
})();
And this gives you output as follows:
Mongoose: songs.remove({}, {})
Mongoose: songlists.remove({}, {})
Mongoose: songs.insertMany([ { _id: 5b06c2ff373eb00d9610aa6e, name: 'a', __v: 0 }, { _id: 5b06c2ff373eb00d9610aa6f, name: 'b', __v: 0 }, { _id: 5b06c2ff373eb00d9610aa70, name: 'c', __v: 0 } ], {})
Mongoose: songlists.insertOne({ list: [ ObjectId("5b06c2ff373eb00d9610aa6e"), ObjectId("5b06c2ff373eb00d9610aa6e"), ObjectId("5b06c2ff373eb00d9610aa6f"), ObjectId("5b06c2ff373eb00d9610aa6e"), ObjectId("5b06c2ff373eb00d9610aa70") ], _id: ObjectId("5b06c2ff373eb00d9610aa71"), __v: 0 })
Mongoose: songlists.find({}, { fields: {} })
Mongoose: songs.find({ _id: { '$in': [ ObjectId("5b06c2ff373eb00d9610aa6e"), ObjectId("5b06c2ff373eb00d9610aa6f"), ObjectId("5b06c2ff373eb00d9610aa70") ] } }, { fields: {} })
{
"popresult": [
{
"list": [
{
"_id": "5b06c2ff373eb00d9610aa6e",
"name": "a",
"__v": 0
},
{
"_id": "5b06c2ff373eb00d9610aa6e",
"name": "a",
"__v": 0
},
{
"_id": "5b06c2ff373eb00d9610aa6f",
"name": "b",
"__v": 0
},
{
"_id": "5b06c2ff373eb00d9610aa6e",
"name": "a",
"__v": 0
},
{
"_id": "5b06c2ff373eb00d9610aa70",
"name": "c",
"__v": 0
}
],
"_id": "5b06c2ff373eb00d9610aa71",
"__v": 0
}
]
}
Mongoose: songs.find({ _id: { '$in': [ ObjectId("5b06c2ff373eb00d9610aa6e"), ObjectId("5b06c2ff373eb00d9610aa6f"), ObjectId("5b06c2ff373eb00d9610aa70") ] } }, { fields: {} })
{
"songs": [
{
"_id": "5b06c2ff373eb00d9610aa6e",
"name": "a",
"__v": 0
},
{
"_id": "5b06c2ff373eb00d9610aa6e",
"name": "a",
"__v": 0
},
{
"_id": "5b06c2ff373eb00d9610aa6f",
"name": "b",
"__v": 0
},
{
"_id": "5b06c2ff373eb00d9610aa6e",
"name": "a",
"__v": 0
},
{
"_id": "5b06c2ff373eb00d9610aa70",
"name": "c",
"__v": 0
}
]
}
Mongoose: songs.aggregate([ { '$match': { _id: { '$in': [ 5b06c2ff373eb00d9610aa6e, 5b06c2ff373eb00d9610aa6f, 5b06c2ff373eb00d9610aa70 ] } } }, { '$addFields': { copies: { '$filter': { input: { '$map': { input: { '$zip': { inputs: [ { '$literal': [Array] }, { '$range': [Array] } ] } }, in: { _id: { '$arrayElemAt': [ '$$this', 0 ] }, idx: { '$arrayElemAt': [ '$$this', 1 ] } } } }, cond: { '$eq': [ '$$this._id', '$_id' ] } } } } }, { '$unwind': '$copies' }, { '$sort': { 'copies.idx': 1 } }, { '$project': { copies: 0 } } ], {})
{
"stupid": [
{
"_id": "5b06c2ff373eb00d9610aa6e",
"name": "a",
"__v": 0
},
{
"_id": "5b06c2ff373eb00d9610aa6e",
"name": "a",
"__v": 0
},
{
"_id": "5b06c2ff373eb00d9610aa6f",
"name": "b",
"__v": 0
},
{
"_id": "5b06c2ff373eb00d9610aa6e",
"name": "a",
"__v": 0
},
{
"_id": "5b06c2ff373eb00d9610aa70",
"name": "c",
"__v": 0
}
]
}
This next part is not really a solution, but more of a note on the subject before somebody else mentions it or something similar.
Falling more under the category of "stupid tricks" is actually forcing the server to return the "copies" of the documents.
let stupid = await Song.aggregate([
{ "$match": { "_id": { "$in": list } } },
{ "$addFields": {
"copies": {
"$filter": {
"input": {
"$map": {
"input": {
"$zip": {
"inputs": [
list,
{ "$range": [0, { "$size": { "$literal": list } } ] }
]
}
},
"in": {
"_id": { "$arrayElemAt": [ "$$this", 0 ] },
"idx": { "$arrayElemAt": [ "$$this", 1 ] }
}
}
},
"cond": { "$eq": ["$$this._id", "$_id"] }
}
}
}},
{ "$unwind": "$copies" },
{ "$sort": { "copies.idx": 1 } },
{ "$project": { "copies": 0 } }
]);
That actually will return all the document "copies" from the server. It does so via the $unwind on a list built with $filter, keeping only those entries which match the current document _id. Multiple matches are retained in that array, which, when processed with $unwind, effectively produces a "copy" of the document for each array entry.
As a bonus, we keep the "idx" of the items in the list by mapping an "index" position into the array via $zip and $range. The following $sort then places the documents in the order they appear in the input list, just to mimic the Array.map() which is being done in the code you should be using. We can then simply $project to "exclude" that field, which was only there as a temporary measure.
All of that said, it's not really a great idea to do such a thing. As already mentioned you are essentially increasing the payload by doing so, when it's really far more logical to construct the "mapping" in the client. And ideally the "end" client as already mentioned.