TL;DR - How do I check whether one-of and all-of a nested array meet specified criteria?
I have a document. Each document has an array of nested outer objects, which themselves have a list of nested inner objects. I need to filter for all documents where at least one of the document's outer nested objects matches. By "matches" I mean that all of that outer nested object's inner objects match in some way. Here's an example mapping for reference:
{
  "document" : {
    "properties" : {
      "name" : {
        "type" : "string"
      },
      "outer" : {
        "type" : "nested",
        "properties" : {
          "inner" : {
            "type" : "nested",
            "properties" : {
              "match" : {
                "type" : "string",
                "index" : "not_analyzed"
              },
              "type" : {
                "type" : "string",
                "index" : "not_analyzed"
              }
            }
          }
        }
      }
    }
  }
}
If the document has no outer/inner objects it is considered to match. To make things worse, the inner objects need to be matched differently depending on their type, in a kind of conditional logic (e.g. CASE in SQL). For example, if the type were the term "Country" then the inner object would be considered to match if the match were a specified country code such as ES. A document may have inner objects of varying type and there is no guarantee that specific types will exist.
Coming from an imperative (Java) programming background, I am having incredible trouble figuring out how to implement this kind of filtering. Nothing I can think of even vaguely matches this behaviour. Thus far all I have is the filtered query:
{
  "filtered" : {
    "query" : {
      "match_all" : { }
    },
    "filter" : {
      "bool" : {
        "should" : {
          "missing" : {
            "field" : "outer.inner.type"
          }
        }
      }
    }
  }
}
So, the question is...
How can I filter to documents that have at least one outer object in which all inner objects match, based on the type of each inner object?
Further details By Request -
{
"name":"First",
"outer":[
{
"inner":[
{"match":"ES","type":"Country"},
{"match":"Elite","type":"Market"}
]
},{
"inner":[
{"match":"GBR","type":"Country"},
{"match":"1st Class","type":"Market"},
{"match":"Admin","type":"Role"}
]
}
],
"lockVersion":0,"sourceId":"1"
}
The above example should come through the filter if we were to provide the market "1st Class" and the country "GBR", because the second of the two outer objects would be considered a match because both inner objects match. If, however, we provided the country "GBR" and the market "Elite" then we would not have this document returned, because neither of the outer objects would have both of their inner objects match in their entirety. If we wanted the second outer object to match then all three inner objects would need to match. Take note that there is an extra type in the third inner object. This leads to a situation where, if a type exists, it needs to have a match; otherwise it doesn't need to match, because it is absent.
The nested type is a specialised version of the object data type that allows arrays of objects to be indexed in a way that they can be queried independently of each other.
Frequently used filters will be cached automatically by Elasticsearch, to speed up performance. Filter context is in effect whenever a query clause is passed to a filter parameter, such as the filter or must_not parameters in the bool query, the filter parameter in the constant_score query, or the filter aggregation.
Having one of a nested array match some criteria turns out to be very simple. A nested filter evaluates to matching/true if any of the nested objects in the array match the specified inner filter. For example, given an array of outer objects where one of those objects has a field match with the value "matching", the following would be considered true.
"nested": {
"path": "outer",
"filter": {
"term" : { "match" : "matching" }
}
}
The above will be considered true/matching if one of the nested outer objects has a field called match with the value "matching".
Having a nested filter only be considered matching if all of the nested objects in an array match is more interesting. In fact, it's impossible to express directly. But given that a nested filter is considered matching if at least one of the nested objects matches, we can reverse the logic and say "none of the nested objects fail to match" to achieve what we need. For example, given an array of nested outer.inner objects where all of those objects have a field match with the value "matching", the following would be considered true.
"not" : {
"nested": {
"path": "outer.inner",
"filter": {
"not" : {
"term" : { "match" : "matching" }
}
}
}
}
The above will be considered true/matching because none of the nested outer.inner objects lack a field called match with the value "matching" (a double negative). This, of course, is the same as all of the nested inner objects having a field match with the value "matching".
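The double negative can be checked outside Elasticsearch with a small model. The sketch below is plain Python, not an Elasticsearch API: any_nested and all_nested are hypothetical helper names, and the nested objects are represented as ordinary dicts. It only illustrates that "not any (not p)" behaves like "all p", including for an empty array.

```python
from typing import Callable, Dict, List

NestedObject = Dict[str, str]
Predicate = Callable[[NestedObject], bool]


def any_nested(objects: List[NestedObject], predicate: Predicate) -> bool:
    """Models a nested filter: true if at least one nested object matches."""
    return any(predicate(obj) for obj in objects)


def all_nested(objects: List[NestedObject], predicate: Predicate) -> bool:
    """Models not(nested(not(predicate))): true if no nested object fails."""
    return not any_nested(objects, lambda obj: not predicate(obj))


def is_matching(obj: NestedObject) -> bool:
    """Stands in for the term filter {"match": "matching"}."""
    return obj.get("match") == "matching"


all_match = [{"match": "matching"}, {"match": "matching"}]
one_fails = all_match + [{"match": "other"}]

print(all_nested(all_match, is_matching))  # every object matches
print(all_nested(one_fails, is_matching))  # one object fails the predicate
print(all_nested([], is_matching))         # empty array: nothing fails
```

Note that in this model an empty array counts as matching, which lines up with the requirement in the question that a document with no inner objects is considered to match.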
You can't check whether a field containing nested objects is missing using the traditional missing filter. This is because nested objects aren't actually stored in the parent document at all; they are stored separately. As such, missing filters will always be considered true. What you can do, however, is check that a match_all filter returns no results, like so:
"not": {
"nested": {
"path": "outer",
"filter": {
"match_all": {}
}
}
}
This is considered true/matching if match_all finds no results.
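To avoid hand-writing the nested braces, the three building blocks above can be generated. This is a minimal sketch that only assembles dicts in the same filter syntax used in this answer; the function names (nested_any_filter, nested_all_filter, nested_missing_filter) are hypothetical and nothing here talks to a cluster.

```python
import json
from typing import Any, Dict

Filter = Dict[str, Any]


def nested_any_filter(path: str, inner_filter: Filter) -> Filter:
    """At least one nested object under `path` matches `inner_filter`."""
    return {"nested": {"path": path, "filter": inner_filter}}


def nested_all_filter(path: str, inner_filter: Filter) -> Filter:
    """No nested object under `path` fails `inner_filter` (double negative)."""
    return {"not": nested_any_filter(path, {"not": inner_filter})}


def nested_missing_filter(path: str) -> Filter:
    """No nested objects exist under `path` (match_all finds nothing)."""
    return {"not": nested_any_filter(path, {"match_all": {}})}


term = {"term": {"match": "matching"}}

# Print each fragment as JSON so it can be compared with the snippets above.
for fragment in (
    nested_any_filter("outer", term),
    nested_all_filter("outer.inner", term),
    nested_missing_filter("outer"),
):
    print(json.dumps(fragment, indent=2))
```

Each helper returns a plain dict, so the fragments can be nested inside one another or placed under the filter key of the filtered query from the question.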