
How to do nested AND and OR filters in ElasticSearch?

My filters are grouped into categories. A document should be returned if it matches any filter within a category, but if two (or more) categories are set, it must match at least one filter in every one of those categories.

If written in pseudo-SQL it would be:

SELECT * FROM Documents WHERE (CategoryA = 'A') AND (CategoryB = 'B' OR CategoryB = 'C')

I've tried Nested filters like so:

{
    "sort": [{
        "orderDate": "desc"
    }],
    "size": 25,
    "query": {
        "match_all": {}
    },
    "filter": {
        "and": [{
            "nested": {
                "path":"hits._source",
                "filter": {
                    "or": [{
                        "term": {
                            "progress": "incomplete"
                        }
                    }, {
                        "term": {
                            "progress": "completed"
                        }
                    }]
                }
            }
        }, {
            "nested": {
                "path":"hits._source",
                "filter": {
                    "or": [{
                        "term": {
                            "paid": "yes"
                        }
                    }, {
                        "term": {
                            "paid": "no"
                        }
                    }]
                }
            }
        }]
    }
}

But evidently I don't quite understand the ES syntax. Is this on the right track or do I need to use another filter?

MHTri asked Apr 02 '14 12:04



2 Answers

This should be it (translated from the given pseudo-SQL):

{
   "sort": [
      {
        "orderDate": "desc"
      }
    ],
    "size": 25,
    "query":
    {
        "filtered":
        {
            "filter":
            {
                "and":
                [
                    { "term": { "CategoryA":"A" } },
                    {
                        "or":
                        [
                            { "term": { "CategoryB":"B" } },
                            { "term": { "CategoryB":"C" } }
                        ]
                    }
                ]
            }
        }
    }
}
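
If the selected values come from application code, the same AND-of-ORs shape can be generated rather than written by hand. A minimal Python sketch, assuming the selection arrives as a mapping from field name to accepted values; `build_category_filter` is a hypothetical helper, not part of any Elasticsearch client, and it emits the same `and`/`or`/`term` filters used above:

```python
import json


def build_category_filter(selected):
    """Build a filter that ANDs across categories and ORs within each one.

    `selected` maps a field name to the list of accepted values
    (a hypothetical input shape, chosen for illustration).
    """
    clauses = []
    for field, values in selected.items():
        terms = [{"term": {field: value}} for value in values]
        if not terms:
            continue  # a category with no selected values adds no constraint
        # A single value needs no "or" wrapper.
        clauses.append(terms[0] if len(terms) == 1 else {"or": terms})
    return {"and": clauses}


category_filter = build_category_filter(
    {"CategoryA": ["A"], "CategoryB": ["B", "C"]}
)
body = {"query": {"filtered": {"filter": category_filter}}}
print(json.dumps(body, indent=2))
```

The sketch does not handle the case where nothing at all is selected; the caller would probably want to skip the filter entirely then.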

I realize you're not mentioning facets but just for the sake of completeness:

You could also use a top-level filter as the basis (like you did) instead of a filtered query (like I did). The resulting JSON is almost identical, with the difference being:

  • a filtered query will filter both the main results as well as facets
  • a filter will only filter the main results NOT the facets.
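
To make the difference concrete, here is a rough Python sketch that builds both request bodies side by side. The `paid` filter and the `progress_counts` terms facet are made up for illustration; only the placement of the filter differs between the two bodies:

```python
import json

# Hypothetical filter and facet, for illustration only.
category_filter = {"term": {"paid": "yes"}}
facets = {"progress_counts": {"terms": {"field": "progress"}}}

# Filtered query: the filter narrows both the hits and the facet counts.
filtered_query_body = {
    "query": {"filtered": {"filter": category_filter}},
    "facets": facets,
}

# Top-level filter: the filter narrows only the hits. The facet is still
# computed over everything the query matches (here match_all).
top_level_filter_body = {
    "query": {"match_all": {}},
    "filter": category_filter,
    "facets": facets,
}

print(json.dumps(filtered_query_body, indent=2))
print(json.dumps(top_level_filter_body, indent=2))
```

As far as I know, the top-level `filter` key was renamed to `post_filter` in Elasticsearch 1.0, so check which name your version expects.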

Lastly, nested filters (which you tried using) are not for nesting one filter inside another, as you seemed to believe. They are for filtering on nested documents, i.e. objects stored in a field that is mapped with the nested type.
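
For contrast, this is roughly what a nested filter is meant for. The sketch assumes a hypothetical mapping in which each document has a `payments` field of type `nested` whose sub-documents carry `method` and `status`; none of these names come from the question:

```python
import json

# Hypothetical mapping: "payments" is a field of type "nested" whose
# sub-documents have "method" and "status" fields.
nested_filter = {
    "nested": {
        "path": "payments",  # the nested field, not "hits._source"
        "filter": {
            "and": [
                {"term": {"payments.method": "card"}},
                {"term": {"payments.status": "settled"}},
            ]
        },
    }
}

body = {"query": {"filtered": {"filter": nested_filter}}}
print(json.dumps(body, indent=2))
```

Both term conditions have to hold within the same nested sub-document, which is the point of the nested filter. Plain top-level fields such as `progress` and `paid` do not need it.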

Geert-Jan answered Oct 24 '22 08:10


Although I have not completely understood your structure, this might be what you need.

You have to think tree-wise. You create a bool filter whose must clause (= AND) requires every embedded bool to match. Each embedded bool uses should (= OR) instead of must: it passes if the field does not exist, or else if the field's value is one of those listed in the terms filter.

I'm not sure if there is a better way, and I don't know how it performs. The match_all query inside filtered is optional and can be left out.

{
    "sort": [
        {
            "orderDate": "desc"
        }
    ],
    "size": 25,
    "query": {
        "filtered": {
            "query": {
                "match_all": {}
            },
            "filter": {
                "bool": {
                    "must": [
                        {
                            "bool": {
                                "should": [
                                    {
                                        "not": {
                                            "exists": {
                                                "field": "progress"
                                            }
                                        }
                                    },
                                    {
                                        "terms": {
                                            "progress": [
                                                "incomplete",
                                                "completed"
                                            ]
                                        }
                                    }
                                ]
                            }
                        },
                        {
                            "bool": {
                                "should": [
                                    {
                                        "not": {
                                            "exists": {
                                                "field": "paid"
                                            }
                                        }
                                    },
                                    {
                                        "terms": {
                                            "paid": [
                                                "yes",
                                                "no"
                                            ]
                                        }
                                    }
                                ]
                            }
                        }
                    ]
                }
            }
        }
    }
}
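
The tree above repeats the same sub-tree once per field, so it can be produced by a small helper. A minimal Python sketch; `missing_or_one_of` and `build_bool_filter` are hypothetical names, and the field values are the ones from the question:

```python
import json


def missing_or_one_of(field, values):
    """Filter that passes when `field` is absent or holds one of `values`."""
    return {
        "bool": {
            "should": [
                {"not": {"exists": {"field": field}}},
                {"terms": {field: list(values)}},
            ]
        }
    }


def build_bool_filter(selected):
    """AND together one missing-or-one-of clause per category."""
    clauses = [missing_or_one_of(f, v) for f, v in selected.items()]
    return {"bool": {"must": clauses}}


bool_filter = build_bool_filter(
    {"progress": ["incomplete", "completed"], "paid": ["yes", "no"]}
)
body = {"query": {"filtered": {"filter": bool_filter}}}
print(json.dumps(body, indent=2))
```

If documents without the field should not match, drop the `not`/`exists` branch and keep only the `terms` filter.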

Diolor answered Oct 24 '22 09:10