In my mongoDB book collection I have documents structured as follow :
/* 0 */
{
"_id" : ObjectId("50485b89b30f1ea69110ff4c"),
"publisher" : {
"$ref" : "boohya",
"$id" : "foo"
},
"displayName" : "Paris Nightlife",
"catalogDescription" : "Some desc goes here",
"languageCode" : "en",
"rating" : 0,
"status" : "LIVE",
"thumbnailId" : ObjectId("50485b89b30f1ea69110ff4b"),
"indexTokens" : ["Nightlife", "Paris"]
}
I perform the following regex query to find all documents having one indexToken starting with "Par" :
{ "indexTokens" : { "$regex" : "^Par" , "$options" : "i"}}
If I select only the indexTokens field to be returned like this :
{ "indexTokens" : 1}
The resulting DBObject is
{ "_id" : { "$oid" : "50485b89b30f1ea69110ff4c"} , "indexTokens" : [ "Nightlife" , "Paris"]}
What I would like to get is ONLY the token / tag that matched the regex (I don0t care about retrieving the document at this point, neither do I need all the tags of the matched document)
Is this a case for the new Aggregation Framework relesed under MongoDB v2.2. ?
If yes how do I modify my query so that the actual result would look like :
{ "indexTokens" : ["Paris", "Paradise River", "Parma" , etc ....]}
Bonus question (do you has teh codez) : How do I do it using the Java driver ?
For now my java looks like :
DBObject query = new BasicDBObject("indexTokens", java.util.regex.Pattern.compile("^"+filter+"", Pattern.CASE_INSENSITIVE));
BasicDBObject fields = new BasicDBObject("indexTokens",1);
DBCursor curs = getCollection()
.find(query, fields)
.sort( new BasicDBObject( "indexTokens" , 1 ))
.limit(maxSuggestionCount);
Thx :)
EDIT:
As per your answers I modified my JAVA code as follow :
BasicDBObject cmdBody = new BasicDBObject("aggregate", "Book");
ArrayList<BasicDBObject> pipeline = new ArrayList<BasicDBObject>();
BasicDBObject match = new BasicDBObject("$match", new BasicDBObject("indexTokens", java.util.regex.Pattern.compile("^"+titleFilter+"", Pattern.CASE_INSENSITIVE)));
BasicDBObject unwind = new BasicDBObject("$unwind", "$indexTokens");
BasicDBObject match2 = new BasicDBObject("$match", new BasicDBObject("indexTokens", java.util.regex.Pattern.compile("^"+titleFilter+"", Pattern.CASE_INSENSITIVE)));
BasicDBObject groupFilters = new BasicDBObject("_id",null);
groupFilters.append("indexTokens", new BasicDBObject( "$push", "$indexTokens"));
BasicDBObject group = new BasicDBObject("$group", groupFilters);
pipeline.add(match);
pipeline.add(unwind);
pipeline.add(match2);
pipeline.add(group);
cmdBody.put("pipeline", pipeline);
CommandResult res = getCollection().getDB().command(cmdBody);
System.out.println(res);
Which outputs
{ "result" : [ { "_id" : null , "indexTokens" : [ "Paris"]}] , "ok" : 1.0}
This is genius !
Thanks a lot !
Building on the response from cirrus, I recommend doing the $unwind
first to avoid the redundant $match
. Something like:
db.books.aggregate(
{$unwind:"$indexTokens"},
{$match:{indexTokens:/^Par/}},
{$group:{_id:null,indexTokens:{$push:"$indexTokens"}}
})
How do you do this in Java? You can use the DBCollection.aggregate(...)
method of the MongoDB v2.9.0 driver. Each pipeline operator, eg. $unwind
or $match
, corresponds to a DBObject
object.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With