
MongoDB Aggregation: How to return only matching elements of an array [duplicate]

In my MongoDB book collection I have documents structured as follows:

/* 0 */
{
  "_id" : ObjectId("50485b89b30f1ea69110ff4c"),

  "publisher" : {
    "$ref" : "boohya",
    "$id" : "foo"
  },
  "displayName" : "Paris Nightlife",
  "catalogDescription" : "Some desc goes here",
  "languageCode" : "en",
  "rating" : 0,
  "status" : "LIVE",
  "thumbnailId" : ObjectId("50485b89b30f1ea69110ff4b"),
  "indexTokens" : ["Nightlife", "Paris"]
}

I perform the following regex query to find all documents having one indexToken starting with "Par" :

{ "indexTokens" : { "$regex" : "^Par" , "$options" : "i"}}

If I select only the indexTokens field to be returned like this :

{ "indexTokens" : 1}

The resulting DBObject is

{ "_id" : { "$oid" : "50485b89b30f1ea69110ff4c"} , "indexTokens" : [ "Nightlife" , "Paris"]}

What I would like to get is ONLY the token / tag that matched the regex (I don't care about retrieving the document at this point, nor do I need all the tags of the matched document).

Is this a case for the new Aggregation Framework released in MongoDB v2.2?

If yes how do I modify my query so that the actual result would look like :

{ "indexTokens" : ["Paris", "Paradise River", "Parma" , etc ....]}

Bonus question (do you has teh codez) : How do I do it using the Java driver ?

For now my java looks like :

// Assumes java.util.regex.Pattern is imported.
DBObject query = new BasicDBObject("indexTokens",
        Pattern.compile("^" + filter, Pattern.CASE_INSENSITIVE));
BasicDBObject fields = new BasicDBObject("indexTokens", 1);
DBCursor curs = getCollection()
        .find(query, fields)
        .sort(new BasicDBObject("indexTokens", 1))
        .limit(maxSuggestionCount);

Thx :)

EDIT:

As per your answers I modified my Java code as follows:

BasicDBObject cmdBody = new BasicDBObject("aggregate", "Book");
ArrayList<BasicDBObject> pipeline = new ArrayList<BasicDBObject>();

// Same case-insensitive prefix regex for both $match stages.
Pattern prefix = Pattern.compile("^" + titleFilter, Pattern.CASE_INSENSITIVE);

// 1. Keep only documents that contain at least one matching token.
BasicDBObject match = new BasicDBObject("$match", new BasicDBObject("indexTokens", prefix));
// 2. Emit one document per array element.
BasicDBObject unwind = new BasicDBObject("$unwind", "$indexTokens");
// 3. Keep only the elements that match.
BasicDBObject match2 = new BasicDBObject("$match", new BasicDBObject("indexTokens", prefix));
// 4. Collect the surviving tokens into a single array.
BasicDBObject groupFilters = new BasicDBObject("_id", null);
groupFilters.append("indexTokens", new BasicDBObject("$push", "$indexTokens"));
BasicDBObject group = new BasicDBObject("$group", groupFilters);

pipeline.add(match);
pipeline.add(unwind);
pipeline.add(match2);
pipeline.add(group);

cmdBody.put("pipeline", pipeline);

CommandResult res = getCollection().getDB().command(cmdBody);
System.out.println(res);

Which outputs

{ "result" : [ { "_id" :  null  , "indexTokens" : [ "Paris"]}] , "ok" : 1.0}

This is genius !

Thanks a lot !

azpublic asked Sep 06 '12 09:09



1 Answer

Building on the response from cirrus, I recommend doing the $unwind first to avoid the redundant $match. Something like:

db.books.aggregate([
    {$unwind: "$indexTokens"},
    {$match: {indexTokens: /^Par/}},
    {$group: {_id: null, indexTokens: {$push: "$indexTokens"}}}
])
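
To make the three stages concrete, here is a minimal plain-Java sketch that mimics them on an in-memory list, with no MongoDB involved. The class name TokenPipelineSketch, the method matchingTokens and the sample token lists are all made up for illustration. It only shows what $unwind, $match and $group compute, not how the server executes them, and it uses a case-insensitive pattern as in the question rather than /^Par/.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: an in-memory imitation of the unwind/match/group pipeline.
public class TokenPipelineSketch {

    // Each inner list stands in for one document's indexTokens array.
    static List<String> matchingTokens(List<List<String>> docs, Pattern pattern) {
        List<String> out = new ArrayList<String>();
        for (List<String> tokens : docs) {
            for (String token : tokens) {              // $unwind: one array element at a time
                if (pattern.matcher(token).find()) {   // $match: keep elements matching the regex
                    out.add(token);                    // $group with $push: collect into one array
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Invented sample data, two "documents".
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("Nightlife", "Paris"),
                Arrays.asList("Parma", "Food"));
        Pattern prefix = Pattern.compile("^par", Pattern.CASE_INSENSITIVE);
        System.out.println(matchingTokens(docs, prefix)); // prints [Paris, Parma]
    }
}
```

Because the filter runs on individual elements after the unwind step, "Nightlife" is dropped even though it shares a document with "Paris".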

How do you do this in Java? You can use the DBCollection.aggregate(...) method of the MongoDB Java driver v2.9.0. Each pipeline operator, e.g. $unwind or $match, corresponds to a DBObject.
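
One detail when porting the $match stage to Java: the question builds its pattern as "^"+filter, so any regex metacharacter typed by the user (for example the + in "C++") becomes part of the expression. Below is a hedged sketch of a stricter builder using only java.util.regex. The class and method names are hypothetical, and it assumes the server's regex engine accepts the \Q...\E quoting that Pattern.quote emits. The resulting Pattern would go wherever the question passes Pattern.compile(...) into a BasicDBObject.

```java
import java.util.regex.Pattern;

// Hypothetical helper: builds a case-insensitive "starts with" pattern
// that treats the user's input as literal text.
public class PrefixPatternSketch {

    static Pattern prefixPattern(String userInput) {
        // Pattern.quote wraps the input so characters like + or . lose their regex meaning.
        return Pattern.compile("^" + Pattern.quote(userInput), Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        Pattern p = prefixPattern("C++");
        System.out.println(p.matcher("c++ primer").find()); // prints true
        System.out.println(p.matcher("Cobol").find());      // prints false
    }
}
```

Without the quoting, "^C++" would be read as a quantified "C" and would also match "Cobol".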

slee answered Sep 30 '22 04:09
