Let's say I have the following documents indexed:
[
{
"Id": 1,
"Numbers": [1, 2, 3]
},
{
"Id": 2,
"Numbers": [4, 5]
}
]
I have a parameter [1,2,4,5] that lists the numbers I am not allowed to see. I want to find documents whose "Numbers" array contains at least one element NOT in my input array (so in this case only the first document should be returned, because it contains 3).
The real scenario is finding groups which (or whose child groups) do not contain products belonging to certain product types. I have recursively indexed product type ids (represented as numbers in the example), and I want to find groups that contain products not belonging to my input parameter (the input parameter being an array of product type ids I am not allowed to see).
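To make the matching rule described above concrete, here is a minimal Python sketch of it outside Elasticsearch. The function name `has_visible_number` is made up for illustration; the documents and the restricted list are the ones from the example.

```python
def has_visible_number(numbers, restricted):
    """Return True if at least one element of `numbers` is not restricted."""
    # The set difference keeps only the values that are absent from `restricted`.
    return bool(set(numbers) - set(restricted))


# The two example documents from the question.
docs = [
    {"Id": 1, "Numbers": [1, 2, 3]},
    {"Id": 2, "Numbers": [4, 5]},
]
restricted = [1, 2, 4, 5]

matching_ids = [d["Id"] for d in docs if has_visible_number(d["Numbers"], restricted)]
print(matching_ids)  # [1]
```

Note that under this rule a document with an empty "Numbers" array never matches, since nothing is left after removing the restricted values.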
Which query/filter should I use and how should it be constructed? I have considered the following:
return desc.Bool(b => b
.MustNot(mn => mn.Bool(mnb => mnb.Must(mnbm => mnbm.Terms(t => t.ItemGroups, permissions.RestrictedItemGroups) && mnbm.Term(t => t.ItemGroupCount, permissions.RestrictedItemGroups.Count())))));
but the problem is that if I have 6 restricted item groups, whereas a given group contains only 3 of them, then I won't find any matches because the count won't match, which makes sense in hindsight. As a workaround I've implemented Results.Except(Restricted) in C# to filter out restricted groups post-search, but I would love to implement it in Elasticsearch.
New answer
I'm leaving the older answer below as it might be of use to other people. In your case, you want to filter out the documents that don't match rather than merely flag them. The following script query returns what you expect, i.e. only the first document:
POST test/_search
{
"query": {
"script": {
"script": {
"source": """
// copy the doc values into a temporary list
def tmp = new ArrayList(doc.Numbers.values);
// remove all ids from the params
tmp.removeIf(n -> params.ids.contains((int)n));
// return true if the array still contains ids, false if not
return tmp.size() > 0;
""",
"params": {
"ids": [
1,
2,
4,
5
]
}
}
}
}
}
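If the restricted ids change per user, the request body above can be generated rather than written by hand. The following is only a sketch: `build_restricted_query` is a hypothetical helper, the Painless source is copied from the query above, and sending the body is left to whichever client you use. Keeping the source constant and passing the ids through `params` lets Elasticsearch reuse the compiled script across requests.

```python
import json

# Painless source copied from the query above; it stays constant so that
# only `params` changes between requests.
SCRIPT_SOURCE = """
def tmp = new ArrayList(doc.Numbers.values);
tmp.removeIf(n -> params.ids.contains((int)n));
return tmp.size() > 0;
"""


def build_restricted_query(restricted_ids):
    """Build the search body for the script query (hypothetical helper)."""
    return {
        "query": {
            "script": {
                "script": {
                    "source": SCRIPT_SOURCE,
                    # De-duplicate and sort so the body is deterministic.
                    "params": {"ids": sorted(set(restricted_ids))},
                }
            }
        }
    }


body = build_restricted_query([5, 1, 2, 4, 4])
print(json.dumps(body, indent=2))
```

The printed JSON can be posted to `test/_search` exactly like the hand-written request.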
Older answer
One way to solve this is by using a script field which will return true or false depending on your condition:
POST test/_search
{
"_source": true,
"script_fields": {
"not_present": {
"script": {
"source": """
// copy the numbers array so the list held in _source is not modified
def tmp = new ArrayList(params._source.Numbers);
// remove all ids from the params
tmp.removeIf(n -> params.ids.contains(n));
// return true if the array still contains data, false if not
return tmp.size() > 0;
""",
"params": {
"ids": [ 1, 2, 4, 5 ]
}
}
}
}
}
The result would look like this:
"hits" : {
"total" : 2,
"max_score" : 1.0,
"hits" : [
{
"_index" : "test",
"_type" : "doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"Id" : 2,
"Numbers" : [
4,
5
]
},
"fields" : {
"not_present" : [
false <--- you don't want this doc
]
}
},
{
"_index" : "test",
"_type" : "doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"Id" : 1,
"Numbers" : [
1,
2,
3
]
},
"fields" : {
"not_present" : [
true <--- you want this one, though
]
}
}
]
}
}
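Because the script field only flags documents, the hits still have to be filtered on the client. A minimal Python sketch of that step, assuming the response has already been parsed into a dict (`keep_flagged_hits` is a made-up name, and the response below is a trimmed copy of the one shown above):

```python
def keep_flagged_hits(response, field="not_present"):
    """Return only the hits whose script field `field` is [True]."""
    hits = response["hits"]["hits"]
    # Script field values arrive wrapped in a list, e.g. [true].
    return [h for h in hits if h.get("fields", {}).get(field, [False])[0]]


# Trimmed copy of the response shown above.
response = {
    "hits": {
        "total": 2,
        "hits": [
            {"_id": "2", "_source": {"Id": 2, "Numbers": [4, 5]},
             "fields": {"not_present": [False]}},
            {"_id": "1", "_source": {"Id": 1, "Numbers": [1, 2, 3]},
             "fields": {"not_present": [True]}},
        ],
    }
}

visible = keep_flagged_hits(response)
print([h["_id"] for h in visible])  # ['1']
```

Filtering this way leaves the reported total and any pagination counting the flagged-out documents as well, which is why the script query in the new answer is the better fit when the documents must be excluded entirely.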