Basically, I'm trying to implement tag functionality on a model.
> db.event.distinct("tags")
[ "bar", "foo", "foobar" ]
A simple distinct query retrieves all distinct tags. However, how would I go about getting all distinct tags that match a certain query? Say, for example, I wanted to get all tags matching foo and expected to get ["foo","foobar"] as a result?
The following queries are my failed attempts at achieving this:
> db.event.distinct("tags",/foo/)
[ "bar", "foo", "foobar" ]
> db.event.distinct("tags",{tags: {$regex: 'foo'}})
[ "bar", "foo", "foobar" ]
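The query argument to .distinct() selects whole documents, not individual array elements, so every tag from any matching document is returned. Use the aggregation framework rather than the .distinct() command: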
db.event.aggregate([
    // De-normalize the array content to separate documents
    { "$unwind": "$tags" },
    // Filter the de-normalized content to remove non-matches
    { "$match": { "tags": /foo/ } },
    // Group the "like" terms as the "key"
    { "$group": {
        "_id": "$tags"
    }}
])
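To make the difference concrete, here is a rough sketch in plain JavaScript (not the MongoDB API) that imitates both behaviours on an in-memory array. The sample documents are hypothetical, since the question does not show the collection's contents, and the helper names are made up for illustration only.

```javascript
// Hypothetical sample documents standing in for the "event" collection.
const events = [
  { _id: 1, tags: ["foo", "bar"] },
  { _id: 2, tags: ["foobar"] },
  { _id: 3, tags: ["bar"] },
];

// Roughly what distinct("tags", query) does: the query picks whole
// documents, then every tag in those documents is collected.
const distinctWithQuery = [...new Set(
  events
    .filter((doc) => doc.tags.some((tag) => /foo/.test(tag)))
    .flatMap((doc) => doc.tags)
)].sort();

// Stand-in for $unwind: one output document per array element.
const unwound = events.flatMap((doc) =>
  doc.tags.map((tag) => ({ _id: doc._id, tags: tag }))
);

// Stand-in for $match: now the regex is tested against a single tag.
const matched = unwound.filter((doc) => /foo/.test(doc.tags));

// Stand-in for $group on "$tags": one result per distinct tag.
const grouped = [...new Set(matched.map((doc) => doc.tags))]
  .map((tag) => ({ _id: tag }));

console.log(distinctWithQuery); // [ 'bar', 'foo', 'foobar' ]
console.log(grouped);           // [ { _id: 'foo' }, { _id: 'foobar' } ]
```

Note that a real $group stage does not guarantee output order, so add a $sort stage if the order matters.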
You are probably better off using an "anchor" (^) at the beginning of the regex if you mean a match from the "start" of the string. It also helps to do this $match before you process $unwind as well:
db.event.aggregate([
    // Match the possible documents. Always the best approach
    { "$match": { "tags": /^foo/ } },
    // De-normalize the array content to separate documents
    { "$unwind": "$tags" },
    // Now "filter" the content to actual matches
    { "$match": { "tags": /^foo/ } },
    // Group the "like" terms as the "key"
    { "$group": {
        "_id": "$tags"
    }}
])
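That makes sure you are not processing $unwind on every document in the collection, but only on those that possibly contain your "matched tags" value. The second $match is still needed after $unwind, because a matching document's array can also hold tags that do not match, so you "filter" again to make sure.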
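As a rough illustration of why the early $match helps, the following plain JavaScript sketch (again not the MongoDB API, with hypothetical sample data and made-up helper names) counts how many de-normalized documents each variant has to produce before filtering.

```javascript
// Hypothetical sample documents; only two of the four contain a "foo..." tag.
const events = [
  { _id: 1, tags: ["foo", "bar"] },
  { _id: 2, tags: ["foobar", "baz"] },
  { _id: 3, tags: ["bar", "baz"] },
  { _id: 4, tags: ["bar"] },
];
const pattern = /^foo/;

// Stand-in for $unwind: one output document per array element.
const unwind = (docs) =>
  docs.flatMap((doc) => doc.tags.map((tag) => ({ _id: doc._id, tags: tag })));

// Stand-in for the first $match: keep documents where any tag matches.
const preMatch = (docs) =>
  docs.filter((doc) => doc.tags.some((tag) => pattern.test(tag)));

// Stand-in for the second $match followed by $group on "$tags".
const distinctMatches = (unwound) =>
  [...new Set(
    unwound.filter((doc) => pattern.test(doc.tags)).map((doc) => doc.tags)
  )].sort();

const unwoundAll = unwind(events);                  // no early $match
const unwoundPreMatched = unwind(preMatch(events)); // early $match first

console.log(unwoundAll.length);                  // 7
console.log(unwoundPreMatched.length);           // 4
console.log(distinctMatches(unwoundAll));        // [ 'foo', 'foobar' ]
console.log(distinctMatches(unwoundPreMatched)); // [ 'foo', 'foobar' ]
```

Both variants give the same distinct tags, but the second one unwinds fewer documents. On a real collection, the first $match stage can also use an index on tags if one exists, which this in-memory sketch does not model.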
The really "complex" way to somewhat mitigate large arrays with possible matches takes a bit more work, and requires MongoDB 2.6 or greater:
db.event.aggregate([
    { "$match": { "tags": /^foo/ } },
    { "$project": {
        "tags": { "$setDifference": [
            { "$map": {
                "input": "$tags",
                "as": "el",
                "in": { "$cond": [
                    { "$eq": [
                        { "$substr": [ "$$el", 0, 3 ] },
                        "foo"
                    ]},
                    "$$el",
                    false
                ]}
            }},
            [false]
        ]}
    }},
    { "$unwind": "$tags" },
    { "$group": { "_id": "$tags" }}
])
So $map is a nice "in-line" processor of arrays, but it can only go so far. The $setDifference operator removes the false values that $map emits for non-matching elements, but ultimately you still need to process $unwind so that the remaining $group stage can produce distinct values overall.
The advantage here is that arrays are now "reduced" to only the "tags" elements that match, so $unwind has less to process. Just don't use this when you want a "count" of the occurrences and the same tag can appear more than once in the same document, because $setDifference treats the array as a set and collapses duplicates. But again, there are other ways to handle that.
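One way to read that caution is that set operators discard duplicates inside a single array. The sketch below imitates the $map and $setDifference step in plain JavaScript (not the MongoDB API) on a hypothetical document in which "foo" appears twice, and shows the occurrence count changing.

```javascript
// Hypothetical document in which the same tag appears twice.
const doc = { _id: 1, tags: ["foo", "foo", "foobar", "bar"] };

// Stand-in for $map + $cond + $substr: keep an element whose first three
// characters are "foo", otherwise emit false as a placeholder.
const mapped = doc.tags.map((el) => (el.slice(0, 3) === "foo" ? el : false));

// Stand-in for $setDifference against [false]: the result has set
// semantics, so duplicates collapse and the false placeholder is dropped.
// (The real operator does not guarantee the order of the result.)
const reduced = [...new Set(mapped)].filter((el) => el !== false);

// Occurrences of "foo" before and after the reduction.
const countBefore = doc.tags.filter((tag) => tag === "foo").length;
const countAfter = reduced.filter((tag) => tag === "foo").length;

console.log(mapped);      // [ 'foo', 'foo', 'foobar', false ]
console.log(reduced);     // [ 'foo', 'foobar' ]
console.log(countBefore); // 2
console.log(countAfter);  // 1
```

For a plain list of distinct tags the collapsed duplicate does not matter, but a per-tag count taken after this step would undercount.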