I am trying to find the ancestors of a given child. At some point, that child's ancestors last name changed. I want to find the last parent of this child with a given last name. For example:
{
"_id":1,
"parent":null
"first":"Bob",
"last":"Sagget"
},
{
"_id":2,
"parent":1,
"first":"Jane",
"last":"Dor"
},
{
"_id":3,
"parent":2,
"first":"Crane",
"last":"Dor"
},
{
"_id":4,
"parent":3,
"first":"Ho",
"last":"Dor"
},
{
"_id":5,
"parent":4,
"first":"Mor",
"last":"Dor"
}
I want to query for _id
5 and get the ancestors with the last name Dor. There will be other people with the name Dor in this database that I don't want to see, so I can't just query for last name of "Dor".
Here is my current aggregation query - this is giving me every ancestor all the way to _id
of 1. How can I stop at _id
of 2?:
db.PeopleDb.aggregate(
[
{
$graphLookup: {
"from": "PeopleDb",
"startWith": "$parent",
"connectFromField": "parent",
"connectToField": "_id",
"as": "linearAncestors"
}
},
{
$match: {
"_id":5
}
},
]);
The basic query you are looking for instead actually "starts" with the restriction on matching just the singular document with _id: 5
and then performs the $graphLookup
in order to find the ancestors.
As for the "condition", there are some steps involved so it is best to walk through the whole process and understand what is happening:
db.PeopleDb.aggregate([
{ "$match": { "_id": 5 } },
{ "$graphLookup": {
"from": "PeopleDb",
"startWith": "$parent",
"connectFromField": "parent",
"connectToField": "_id",
"as": "people",
"depthField": "depth"
}}
])
This then returns all the recursive chain connecting the parent
to the _id
, noting also the "optional" "depthField"
setting to include the actual "depth" of the match within the returned results:
{
"_id" : 5,
"parent" : 4,
"first" : "Mor",
"last" : "Dor",
"people" : [
{
"_id" : 1,
"parent" : null,
"first" : "Bob",
"last" : "Sagget",
"depth" : NumberLong(3)
},
{
"_id" : 2,
"parent" : 1,
"first" : "Jane",
"last" : "Dor",
"depth" : NumberLong(2)
},
{
"_id" : 3,
"parent" : 2,
"first" : "Crane",
"last" : "Dor",
"depth" : NumberLong(1)
},
{
"_id" : 4,
"parent" : 3,
"first" : "Ho",
"last" : "Dor",
"depth" : NumberLong(0)
}
]
}
So noting that 4
is returned since it is the parent of the initial document, then 3
for it's parent and then 2
as it's parent and so on.
You can restrict the "depth" of matching using the "optional" "maxDepth"
parameter to the pipeline stage, or in the specific case of "excluding the ROOT" element you can use the "restrictSearchWithMatch"
option to simply exclude results with a null
parent:
db.PeopleDb.aggregate([
{ "$match": { "_id": 5 } },
{ "$graphLookup": {
"from": "PeopleDb",
"startWith": "$parent",
"connectFromField": "parent",
"connectToField": "_id",
"as": "people",
"depthField": "depth",
"restrictSearchWithMatch": { "parent": { "$ne": null } }
}}
])
Which returns the same result, but excludes the "ROOT" document where the "parent"
field is null
:
{
"_id" : 5,
"parent" : 4,
"first" : "Mor",
"last" : "Dor",
"people" : [
{
"_id" : 2,
"parent" : 1,
"first" : "Jane",
"last" : "Dor",
"depth" : NumberLong(2)
},
{
"_id" : 3,
"parent" : 2,
"first" : "Crane",
"last" : "Dor",
"depth" : NumberLong(1)
},
{
"_id" : 4,
"parent" : 3,
"first" : "Ho",
"last" : "Dor",
"depth" : NumberLong(0)
}
]
}
And of course the same principle applies to your "last"
condition which can match only documents where that condition was true. Here I'll show both conditions, but the $or
is optional:
db.PeopleDb.aggregate([
{ "$match": { "_id": 5 } },
{ "$graphLookup": {
"from": "PeopleDb",
"startWith": "$parent",
"connectFromField": "parent",
"connectToField": "_id",
"as": "people",
"depthField": "depth",
"restrictSearchWithMatch": {
"$or": [
{ "parent": { "$ne": null }},
{ "last": "Dor" }
]
}
}}
])
Note however that the "restrictSearchWithMatch"
is a "recursive" condition, therefore if any "ancestor" in the chain does not meet the "last"
condition then the chain is broken and further ancestors are not retrieved. In order to get "all the ancestors" yet only show those who have the matching "last"
then you $filter
the resulting array content instead:
db.PeopleDb.aggregate([
{ "$match": { "_id": 5 } },
{ "$graphLookup": {
"from": "PeopleDb",
"startWith": "$parent",
"connectFromField": "parent",
"connectToField": "_id",
"as": "people",
"depthField": "depth",
}},
{ "$addFields": {
"people": {
"$filter": {
"input": "$people",
"cond": { "$eq": [ "$$this.last", "Dor" ] }
}
}
}}
])
Or in fact "dynamically" by comparing to the value of the initial matched document to which the "ancestors" are related to by using the field value expression instead of a hard coded value:
db.PeopleDb.aggregate([
{ "$match": { "_id": 5 } },
{ "$graphLookup": {
"from": "PeopleDb",
"startWith": "$parent",
"connectFromField": "parent",
"connectToField": "_id",
"as": "people",
"depthField": "depth",
}},
{ "$addFields": {
"people": {
"$filter": {
"input": "$people",
"cond": { "$eq": [ "$$this.last", "$last" ] }
}
}
}}
])
In this case it's the same 2,3,4
ancestors, but in the event that say ancestor 3
actually had a different value for "last"
then using $filter
actually returns 2,4
whilst the "restrictSearchWithMatch"
would only return 4
since the 3
would "break the chain". That's the main difference.
N.B The one thing you cannot do of yet using
$graphLookup
is that "field comparison expression" is not allowed. If you wanted something along those lines then you would do further manipulation with$filter
and possibly other operations such as$indexOfArray
if your actual intent was indeed to "break the chain" at the point where that "field comparison" was not met within a recursive search.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With