Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Mongodb Aggregate a $slice to get an element in exact position from a nested array

I would like to retrieve a value from a nested array where it exists at an exact position within the array.

I want to create name value pairs by doing $slice[0,1] for the name and then $slice[1,1] for the value.

Before I attempt to use aggregate, I want to attempt a find within a nested array. I can do what I want on a single depth array in a document as shown below:

{
"_id" : ObjectId("565cc5261506995581569439"),
"a" : [ 
    4, 
    2, 
    8, 
    71, 
    21
]
}

I apply the following: db.getCollection('anothertest').find({},{_id:0, a: {$slice:[0,1]}}) and I get:

{
"a" : [ 
    4
]
}

This is fantastic. However, what if the array I want to $slice [0,1] is located within the document at objectRawOriginData.Reports.Rows.Rows.Cells?

If I can first of all FIND then I want to apply the same as an AGGREGATE.

like image 483
Matt Lightbourn Avatar asked Nov 30 '15 22:11

Matt Lightbourn


1 Answers

Your best bet here and especially if your application is not yet ready for release is to hold off until MongoDB 3.2 for deployment, or at least start working with a release candidate in the interim. The main reason being is that the "projection" $slice does not work with the aggregation framework, as do not other forms of array matching projection as well. But this has been addressed for the upcoming release.

This is going to give you a couple of new operators, being $slice and even $arrayElemAt which can be used to address array elements by position in the aggregation pipeline.

Either:

db.getCollection('anothertest').aggregate([
    { "$project": {
        "_id": 0,
        "a": { "$slice": ["$a",0,1] }
    }}
])

Which returns the familiar:

{ "a" : [ 4 ] }

Or:

db.getCollection('anothertest').aggregate([
    { "$project": {
        "_id": 0,
        "a": { "$arrayElemAt": ["$a", 0] }
    }}
])

Which is just the element and not an array:

{ "a" : 4 }

Until that release becomes available other than in release candidate form, the currently available operators make it quite easy for the "first" element of the array:

db.getCollection('anothertest').aggregate([
    { "$unwind": "$a" },
    { "$group": {
        "_id": "$_id",
        "a": { "$first": "$a" }
    }}
])

Through use of the $first operator after $unwind. But getting another indexed position becomes horribly iterative:

db.getCollection('anothertest').aggregate([
    { "$unwind": "$a" },
    // Keeps the first element
    { "$group": {
        "_id": "$_id",
        "first": { "$first": "$a" },
        "a": { "$push": "$a" }
    }},
    { "$unwind": "$a" },
    // Removes the first element
    { "$redact": {
        "$cond": {
            "if": { "$ne": [ "$first", "$a" ] },
            "then": "$$KEEP",
            "else": "$$PRUNE"
        }
    }},
    // Top is now the second element
    { "$group": {
        "_id": "$_id",
        "second": { "$first": "$a" }
    }}
])

And so on, and also a lot of handling to alter that to deal with arrays that might be shorter than the "nth" element you are looking for. So "possible", but ugly and not performant.

Also noting that is "not really" working with "indexed positions", and is purely matching on values. So duplicate values would easily be removed, unless there was another unique identifier per array element to work with. Future $unwind also has the ability to project an array index, which is handy for other purposes, but the other operators are more useful for this specific case than that feature.

So for my money I would wait till you had the feature available to be able to integrate this in an aggregation pipeline, or at least re-consider why you believe you need it and possibly design around it.

like image 97
Blakes Seven Avatar answered Oct 27 '22 09:10

Blakes Seven