MongoDB Projection of Nested Arrays

I've got a collection "accounts" which contains documents similar to this structure:

{
    "email" : "[email protected]",
    "groups" : [
        {
            "name" : "group1",
            "contacts" : [
                { "localId" : "c1", "address" : "some address 1" },
                { "localId" : "c2", "address" : "some address 2" },
                { "localId" : "c3", "address" : "some address 3" }
            ]
        },
        {
            "name" : "group2",
            "contacts" : [
                { "localId" : "c1", "address" : "some address 1" },
                { "localId" : "c3", "address" : "some address 3" }
            ]
        }
    ]
}

With the following query and projection:

q = { "email" : "[email protected]", "groups" : { $elemMatch: { "name" : "group1" } } }
p = { "groups.name" : 0, "groups" : { $elemMatch: { "name" : "group1" } } }
db.accounts.find( q, p ).pretty()

I successfully get just the group I'm interested in for the specified account.

Question: How can I get a limited list of "contacts" within a certain "group" of a specified "account"? Let's suppose I've got the following arguments:

  • account: email - "[email protected]"
  • group: name - "group1"
  • contact: array of localIds - [ "c1", "c3", "Not existing id" ]

Given these arguments I'd like to have the following result:

{
    "groups" : [
        {
            "name" : "group1", // might be omitted
            "contacts" : [
                { "localId" : "c1", "address" : "some address 1" },
                { "localId" : "c3", "address" : "some address 3" }
            ]
        }
    ]
}

I don't need anything else apart from the resulting contacts.

Approaches

All queries try to fetch just one matching contact instead of a list of matching contacts, for the sake of simplicity. I've tried the following queries without any success:

p = { "groups.name" : 0, "groups" : { $elemMatch: { "name" : "group1", "contacts" : { $elemMatch: { "localId" : "c1" } } } } }
p = { "groups.name" : 0, "groups" : { $elemMatch: { "name" : "group1", "contacts.localId" : "c1" } } }
Not working: these return the whole array or nothing, depending on localId.


p = { "groups.$" : { $elemMatch: { "localId" : "c1" } } }
error: {
    "$err" : "Can't canonicalize query: BadValue Cannot use $elemMatch projection on a nested field.",
    "code" : 17287
}


p = { "groups.contacts" : { $elemMatch: { "localId" : "c1" } } }
error: {
    "$err" : "Can't canonicalize query: BadValue Cannot use $elemMatch projection on a nested field.",
    "code" : 17287
}

Any help is appreciated!

asked Mar 11 '15 09:03 by cbopp



2 Answers

2017 Update

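Such a well-put question deserves a modern response. The sort of array filtering requested can be done in MongoDB 3.2 and later (where $filter is available) with just $match and $project pipeline stages, much like the original plain query operation intends. The example below uses $addFields, which requires 3.4; on 3.2, $project accepts the same expression (and returns only the fields listed).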

db.accounts.aggregate([
  { "$match": {
    "email" : "[email protected]",
    "groups": {
      "$elemMatch": { 
        "name": "group1",
        "contacts.localId": { "$in": [ "c1", "c3", "Not existing id" ] }
      }
    }
  }},
  { "$addFields": {
    "groups": {
      "$filter": {
        "input": {
          "$map": {
            "input": "$groups",
            "as": "g",
            "in": {
              "name": "$$g.name",
              "contacts": {
                "$filter": {
                  "input": "$$g.contacts",
                  "as": "c",
                  "cond": {
                    "$or": [
                      { "$eq": [ "$$c.localId", "c1" ] },
                      { "$eq": [ "$$c.localId", "c3" ] }
                    ]
                  } 
                }
              }
            }
          }
        },
        "as": "g",
        "cond": {
          "$and": [
            { "$eq": [ "$$g.name", "group1" ] },
            { "$gt": [ { "$size": "$$g.contacts" }, 0 ] }
          ]
        }
      }
    }
  }}
])

This makes use of the $filter and $map operators to return only the array elements that meet the conditions, and it performs far better than using $unwind. Since the pipeline stages effectively mirror the "query" and "project" structure of a .find() operation, performance is basically on par with such an operation.
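
If the list of ids is only known at runtime, the hard-coded $or can be swapped for the aggregation form of $in (available from 3.4). Here's a minimal sketch of that idea, not part of the original answer: buildContactsPipeline is a hypothetical helper name, the email is a placeholder, and it assumes every group has a "contacts" array and that no id starts with "$".

```javascript
// Hypothetical helper (not a MongoDB API): builds the same two-stage pipeline
// for an arbitrary email, group name and list of contact ids.
function buildContactsPipeline(email, groupName, localIds) {
  return [
    // Select the account, using the same conditions as the .find() query.
    { $match: {
      email: email,
      groups: { $elemMatch: { name: groupName, "contacts.localId": { $in: localIds } } }
    }},
    // Rewrite "groups": trim each group's contacts, then drop non-matching groups.
    { $addFields: {
      groups: {
        $filter: {
          input: {
            $map: {
              input: "$groups",
              as: "g",
              in: {
                name: "$$g.name",
                contacts: {
                  $filter: {
                    input: "$$g.contacts",
                    as: "c",
                    // Aggregation form of $in: [ value, array ]
                    cond: { $in: [ "$$c.localId", localIds ] }
                  }
                }
              }
            }
          },
          as: "g",
          cond: { $and: [
            { $eq: [ "$$g.name", groupName ] },
            { $gt: [ { $size: "$$g.contacts" }, 0 ] }
          ]}
        }
      }
    }}
  ];
}

// Placeholder email; substitute the real account address.
const pipeline = buildContactsPipeline("user@example.com", "group1", [ "c1", "c3", "Not existing id" ]);
// In the mongo shell: db.accounts.aggregate(pipeline)
```

The helper only builds a plain array of stages, so it can be inspected or logged before being handed to aggregate().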

Note that where the intention is to work "across documents", bringing details together out of multiple documents rather than one, some form of $unwind is usually required so that the array items become accessible for grouping.


On releases before 3.2, where $filter is not available, this is basically the approach:

db.accounts.aggregate([
    // Match the documents by query
    { "$match": {
        "email" : "[email protected]",
        "groups.name": "group1",
        "groups.contacts.localId": { "$in": [ "c1", "c3", "Not existing id" ] },
    }},

    // De-normalize nested array
    { "$unwind": "$groups" },
    { "$unwind": "$groups.contacts" },

    // Filter the actual array elements as desired
    { "$match": {
        "groups.name": "group1",
        "groups.contacts.localId": { "$in": [ "c1", "c3", "Not existing id" ] },
    }},

    // Group the intermediate result.
    { "$group": {
        "_id": { "email": "$email", "name": "$groups.name" },
        "contacts": { "$push": "$groups.contacts" }
    }},

    // Group the final result
    { "$group": {
        "_id": "$_id.email",
        "groups": { "$push": {
            "name": "$_id.name",
            "contacts": "$contacts" 
        }}
    }}
])

This is "array filtering" on more than a single match which the basic projection capabilities of .find() cannot do.

You have "nested" arrays, therefore you need to apply $unwind twice, along with the other operations.
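
To see why two $unwind stages are needed, here's a rough plain-JavaScript imitation of what the stages do to the sample document from the question. It doesn't talk to MongoDB at all; unwindTwice and regroup are hypothetical names made up for this sketch, the email is a placeholder, and the "_id" that the real $group stages emit is left out.

```javascript
// Sample document from the question (email replaced by a placeholder).
const account = {
  email: "user@example.com",
  groups: [
    { name: "group1", contacts: [
      { localId: "c1", address: "some address 1" },
      { localId: "c2", address: "some address 2" },
      { localId: "c3", address: "some address 3" }
    ]},
    { name: "group2", contacts: [
      { localId: "c1", address: "some address 1" },
      { localId: "c3", address: "some address 3" }
    ]}
  ]
};

// Rough equivalent of { $unwind: "$groups" } then { $unwind: "$groups.contacts" }:
// one output row per (group, contact) pair.
function unwindTwice(doc) {
  return doc.groups.flatMap(g =>
    g.contacts.map(c => ({ email: doc.email, groups: { name: g.name, contacts: c } }))
  );
}

// Rough equivalent of the second $match followed by both $group stages.
function regroup(rows, groupName, localIds) {
  const byGroup = new Map();
  for (const row of rows) {
    if (row.groups.name !== groupName) continue;
    if (!localIds.includes(row.groups.contacts.localId)) continue;
    if (!byGroup.has(row.groups.name)) byGroup.set(row.groups.name, []);
    byGroup.get(row.groups.name).push(row.groups.contacts);
  }
  return { groups: [...byGroup].map(([name, contacts]) => ({ name, contacts })) };
}

const rows = unwindTwice(account); // one row per contact, across both groups
const result = regroup(rows, "group1", [ "c1", "c3", "Not existing id" ]);
console.log(JSON.stringify(result, null, 2));
```

Only after the second unwind does each row carry a single contact, which is what lets a plain equality or $in condition pick out individual contacts.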

answered Oct 11 '22 00:10 by Neil Lunn


You could use the $unwind operator of the aggregation framework. For example:

db.accounts.aggregate([
    { $unwind: '$groups' },
    { $unwind: '$groups.contacts' },
    { $match: { email: '[email protected]', 'groups.name': 'group1', 'groups.contacts.localId': { $in: ['c1', 'c3', 'whatever'] } } }
]);

Should give the following result:

{ "_id" : ObjectId("5500103e706342bc096e2e14"), "email" : "[email protected]", "groups" : { "name" : "group1", "contacts" : { "localId" : "c1", "address" : "some address 1" } } }
{ "_id" : ObjectId("5500103e706342bc096e2e14"), "email" : "[email protected]", "groups" : { "name" : "group1", "contacts" : { "localId" : "c3", "address" : "some address 3" } } }

If you want only one object, you can then use the $group operator.
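
A possible shape for that extra step, as a sketch only: the stages are kept in a plain array (the name stages and the placeholder email are assumptions for illustration) with a $group appended that pushes the unwound contacts back into one array per account and group.

```javascript
// Stage list kept in a variable so it can be inspected before running it.
const stages = [
  { $unwind: "$groups" },
  { $unwind: "$groups.contacts" },
  { $match: {
    email: "user@example.com", // placeholder; substitute the real account address
    "groups.name": "group1",
    "groups.contacts.localId": { $in: [ "c1", "c3", "whatever" ] }
  }},
  // Collapse the per-contact rows back into one document per account and group.
  { $group: {
    _id: { email: "$email", name: "$groups.name" },
    contacts: { $push: "$groups.contacts" }
  }}
];
// In the mongo shell: db.accounts.aggregate(stages)
```

With the sample document this should leave a single result whose "contacts" array holds the matched contacts.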

answered Oct 11 '22 00:10 by Niabb