Resolving MongoDB DBRef array using Mongo Native Query and working on the resolved documents

My MongoDB database is made up of 2 main collections:

1) Maps

{
    "_id" : ObjectId("542489232436657966204394"),
    "fileName" : "importFile1.json",
    "territories" : [ 
        {
            "$ref" : "territories",
            "$id" : ObjectId("5424892224366579662042e9")
        }, 
        {
            "$ref" : "territories",
            "$id" : ObjectId("5424892224366579662042ea")
        }
    ]
},

{
    "_id" : ObjectId("542489262436657966204398"),
    "fileName" : "importFile2.json",
    "territories" : [ 
        {
            "$ref" : "territories",
            "$id" : ObjectId("542489232436657966204395")
        }
    ],
    "uploadDate" : ISODate("2012-08-22T09:06:40.000Z")
}

2) Territories, which are referenced in "Map" objects :

{
    "_id" : ObjectId("5424892224366579662042e9"),
    "name" : "Afghanistan",
    "area" : 653958
},
{
    "_id" : ObjectId("5424892224366579662042ea"),
    "name" : "Angola",
    "area" : 1252651
},
{
    "_id" : ObjectId("542489232436657966204395"),
    "name" : "Unknown",
    "area" : 0
}

My objective is to list every map with its cumulative area and number of territories. I am trying the following query:

db.maps.aggregate(
    {'$unwind':'$territories'},
    {'$group':{
        '_id':'$fileName',
        'numberOf': {'$sum': '$territories.name'}, 
        'locatedArea':{'$sum':'$territories.area'}
        }
    })

However, the results show 0 for each of these values:

{
    "result" : [ 
        {
            "_id" : "importFile2.json",
            "numberOf" : 0,
            "locatedArea" : 0
        }, 
        {
            "_id" : "importFile1.json",
            "numberOf" : 0,
            "locatedArea" : 0
        }
    ],
    "ok" : 1
}

I probably did something wrong when trying to access the fields of a territory (name and area), but I couldn't find an example of such a case in the Mongo docs. area is stored as an integer, and name as a string.

Bruno Pérel asked Oct 20 '22 01:10


1 Answer

"I probably did something wrong when trying to access the fields of a territory (name and area), but I couldn't find an example of such a case in the Mongo docs. area is stored as an integer, and name as a string."

Yes indeed: the "territories" field holds an array of database references (DBRefs), not the actual documents. A DBRef is an object containing the information needed to locate the actual document.

You can see this in the example above by running the following mongo shell query:

db.maps.find({"_id":ObjectId("542489232436657966204394")}).forEach(function(doc){print(doc.territories[0]);})

It prints the DBRef object rather than the document itself:

Output: DBRef("territories", ObjectId("5424892224366579662042e9"))

So '$sum': '$territories.name' and '$sum': '$territories.area' both return 0, since the DBRef objects have no fields named name or area.
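To make the zero result concrete, here is a minimal plain-JavaScript sketch. It is not MongoDB code: the helper sumField and the object literals standing in for DBRefs are illustrative assumptions, with ids written as plain strings. It only imitates the documented behaviour of $sum, which ignores non-numeric values and yields 0 when nothing numeric is found.

```javascript
// Plain-JavaScript stand-ins for the unwound DBRef entries of importFile1.json.
// Each entry carries only $ref and $id; there is no name or area field.
const unwoundRefs = [
  { $ref: "territories", $id: "5424892224366579662042e9" },
  { $ref: "territories", $id: "5424892224366579662042ea" }
];

// Hypothetical helper imitating $sum: missing or non-numeric values
// contribute nothing, so the total stays at 0.
function sumField(entries, field) {
  return entries.reduce(function (total, entry) {
    const value = entry[field];
    return typeof value === "number" ? total + value : total;
  }, 0);
}

const numberOf = sumField(unwoundRefs, "name");
const locatedArea = sumField(unwoundRefs, "area");
console.log(numberOf, locatedArea); // prints: 0 0
```

Note that even with resolved documents, summing name would still give 0 in this sketch, because name is a string rather than a number.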

So you need to resolve each reference to its document before accessing something like $territories.name.

To achieve what you want, you can use the cursor's map() method, since neither the aggregation framework nor map-reduce supports sub-queries, and each map document is already self-contained, with references to its territories.

Steps:

a) get each map
b) resolve the `DBRef`.
c) calculate the total area, and the number of territories.
d) make and return the desired structure.

Mongo shell script:

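db.maps.find().map(function(doc) {
    // Collect the referenced ids and remember which collection they point to.
    // All references in the sample data point to the same collection.
    var refName = null;
    var territory_refs = doc.territories.map(function(terr_ref) {
        refName = terr_ref.$ref;
        return terr_ref.$id;
    });
    var areaSum = 0;
    if (refName !== null) {
        // Look the collection up by the name stored in the DBRef.
        db.getCollection(refName).find({
            "_id" : {
                $in : territory_refs
            }
        }).forEach(function(i) {
            areaSum += i.area;
        });
    }
    return {
        "id" : doc.fileName,
        "noOfTerritories" : territory_refs.length,
        "areaSum" : areaSum
    };
})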

Output:

[
        {
                "id" : "importFile1.json",
                "noOfTerritories" : 2,
                "areaSum" : 1906609
        },
        {
                "id" : "importFile2.json",
                "noOfTerritories" : 1,
                "areaSum" : 0
        }
]
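The resolve-then-sum steps can also be checked without a running server. The following is a rough plain-JavaScript sketch over in-memory copies of the sample data; the names territoryById and summarise are hypothetical, and ids are plain strings here, whereas the real collections store ObjectId values.

```javascript
// In-memory copies of the sample data (ids as plain strings, an assumption
// made so the sketch runs outside the mongo shell).
const territories = [
  { _id: "5424892224366579662042e9", name: "Afghanistan", area: 653958 },
  { _id: "5424892224366579662042ea", name: "Angola", area: 1252651 },
  { _id: "542489232436657966204395", name: "Unknown", area: 0 }
];
const maps = [
  { fileName: "importFile1.json", territories: [
    { $ref: "territories", $id: "5424892224366579662042e9" },
    { $ref: "territories", $id: "5424892224366579662042ea" }
  ] },
  { fileName: "importFile2.json", territories: [
    { $ref: "territories", $id: "542489232436657966204395" }
  ] }
];

// Index territories by id so each reference resolves with one lookup.
const territoryById = new Map(territories.map(function (t) {
  return [t._id, t];
}));

// Hypothetical helper: resolve every reference of one map and summarise it.
function summarise(mapDoc) {
  const resolved = mapDoc.territories
    .map(function (ref) { return territoryById.get(ref.$id); })
    .filter(function (t) { return t !== undefined; }); // skip dangling references
  return {
    id: mapDoc.fileName,
    noOfTerritories: mapDoc.territories.length,
    areaSum: resolved.reduce(function (sum, t) { return sum + t.area; }, 0)
  };
}

const summaries = maps.map(summarise);
console.log(JSON.stringify(summaries, null, 2));
```

With the sample data this yields the same totals as the shell script: 2 territories and an area of 1906609 for importFile1.json, and 1 territory with an area of 0 for importFile2.json.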

Map-reduce functions should not, and cannot, be used to resolve DBRefs on the server side. See what the documentation has to say:

The map function should not access the database for any reason.

The map function should be pure, or have no impact outside of the function (i.e. side effects.)

The reduce function should not access the database, even to perform read operations. The reduce function should not affect the outside system.

Moreover, even if a reduce function were used (which cannot work anyway), it would never be called for your problem, since grouping by "fileName" or by ObjectId always yields a single document per key in your dataset.

MongoDB will not call the reduce function for a key that has only a single value
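To illustrate the single-value rule quoted above, here is a simplified plain-JavaScript imitation of map-reduce grouping. It is only a sketch, not MongoDB's implementation: runMapReduce, the { key, value } pair shape, and the pre-counted territories field are assumptions made for the example.

```javascript
// Hypothetical, simplified imitation of map-reduce grouping.
function runMapReduce(docs, mapFn, reduceFn) {
  const groups = new Map();
  docs.forEach(function (doc) {
    const pair = mapFn(doc); // { key, value }
    if (!groups.has(pair.key)) {
      groups.set(pair.key, []);
    }
    groups.get(pair.key).push(pair.value);
  });
  const result = {};
  groups.forEach(function (values, key) {
    // Mirror the documented rule: a key with a single value skips reduce.
    result[key] = values.length === 1 ? values[0] : reduceFn(key, values);
  });
  return result;
}

let reduceCalls = 0;
const output = runMapReduce(
  [
    { fileName: "importFile1.json", territories: 2 },
    { fileName: "importFile2.json", territories: 1 }
  ],
  function (doc) { return { key: doc.fileName, value: doc.territories }; },
  function (key, values) {
    reduceCalls += 1;
    return values.reduce(function (a, b) { return a + b; }, 0);
  }
);
console.log(reduceCalls, output);
```

Because every fileName occurs once, each key holds a single value and the reduce callback is never invoked in this sketch.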

BatScream answered Nov 03 '22 19:11