Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

elasticsearch bool query combine must with OR

People also ask

Should and must not Elasticsearch?

Using must_not tells Elasticsearch that document matches cannot include any of the queries that fall under the must_not clause. should – It would be ideal for the matching documents to include all of the queries in the should clause, but they do not have to be included. Scoring is used to rank the matches.

What is bool query in Elasticsearch?

Boolean, or a bool query in Elasticsearch, is a type of search that allows you to combine conditions using Boolean conditions. Elasticsearch will search the document in the specified index and return all the records matching the combination of Boolean clauses.

Should must Elasticsearch?

must means: Clauses that must match for the document to be included. should means: If these clauses match, they increase the _score ; otherwise, they have no effect. They are simply used to refine the relevance score for each document. Yes you can use multiple filters inside must .

What is minimum should match Elasticsearch?

Minimum Should Match is another search technique that allows you to conduct a more controlled search on related or co-occurring topics by specifying the number of search terms or phrases in the query that should occur within the records returned.


  • OR is spelled should
  • AND is spelled must
  • NOR is spelled should_not

Example:

You want to see all the items that are (round AND (red OR blue)):

{
    "query": {
        "bool": {
            "must": [
                {
                    "term": {"shape": "round"}
                },
                {
                    "bool": {
                        "should": [
                            {"term": {"color": "red"}},
                            {"term": {"color": "blue"}}
                        ]
                    }
                }
            ]
        }
    }
}

You can also do more complex versions of OR, for example if you want to match at least 3 out of 5, you can specify 5 options under "should" and set a "minimum_should" of 3.

Thanks to Glen Thompson and Sebastialonso for finding where my nesting wasn't quite right before.

Thanks also to Fatmajk for pointing out that "term" becomes "match" in ElasticSearch 6.


I finally managed to create a query that does exactly what i wanted to have:

A filtered nested boolean query. I am not sure why this is not documented. Maybe someone here can tell me?

Here is the query:

GET /test/object/_search
{
  "from": 0,
  "size": 20,
  "sort": {
    "_score": "desc"
  },
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "state": 1
              }
            }
          ]
        }
      },
      "query": {
        "bool": {
          "should": [
            {
              "bool": {
                "must": [
                  {
                    "match": {
                      "name": "foo"
                    }
                  },
                  {
                    "match": {
                      "name": "bar"
                    }
                  }
                ],
                "should": [
                  {
                    "match": {
                      "has_image": {
                        "query": 1,
                        "boost": 100
                      }
                    }
                  }
                ]
              }
            },
            {
              "bool": {
                "must": [
                  {
                    "match": {
                      "info": "foo"
                    }
                  },
                  {
                    "match": {
                      "info": "bar"
                    }
                  }
                ],
                "should": [
                  {
                    "match": {
                      "has_image": {
                        "query": 1,
                        "boost": 100
                      }
                    }
                  }
                ]
              }
            }
          ],
          "minimum_should_match": 1
        }
      }    
    }
  }
}

In pseudo-SQL:

SELECT * FROM /test/object
WHERE 
    ((name=foo AND name=bar) OR (info=foo AND info=bar))
AND state=1

Please keep in mind that it depends on your document field analysis and mappings how name=foo is internally handled. This can vary from a fuzzy to strict behavior.

"minimum_should_match": 1 says, that at least one of the should statements must be true.

This statements means that whenever there is a document in the resultset that contains has_image:1 it is boosted by factor 100. This changes result ordering.

"should": [
  {
    "match": {
      "has_image": {
        "query": 1,
        "boost": 100
      }
    }
   }
 ]

Have fun guys :)


This is how you can nest multiple bool queries in one outer bool query this using Kibana,

  • bool indicates we are using boolean
  • must is for AND
  • should is for OR
GET my_inedx/my_type/_search
{
  "query" : {
     "bool": {             //bool indicates we are using boolean operator
          "must" : [       //must is for **AND**
               {
                 "match" : {
                       "description" : "some text"  
                   }
               },
               {
                  "match" :{
                        "type" : "some Type"
                   }
               },
               {
                  "bool" : {          //here its a nested boolean query
                        "should" : [  //should is for **OR**
                               {
                                 "match" : {
                                     //ur query
                                }
                               },
                               { 
                                  "match" : {} 
                               }     
                             ]
                        }
               }
           ]
      }
  }
}

This is how you can nest a query in ES


There are more types in "bool" like,

  1. Filter
  2. must_not

I recently had to solve this problem too, and after a LOT of trial and error I came up with this (in PHP, but maps directly to the DSL):

'query' => [
    'bool' => [
        'should' => [
            ['prefix' => ['name_first' => $query]],
            ['prefix' => ['name_last' => $query]],
            ['prefix' => ['phone' => $query]],
            ['prefix' => ['email' => $query]],
            [
                'multi_match' => [
                    'query' => $query,
                    'type' => 'cross_fields',
                    'operator' => 'and',
                    'fields' => ['name_first', 'name_last']
                ]
            ]
        ],
        'minimum_should_match' => 1,
        'filter' => [
            ['term' => ['state' => 'active']],
            ['term' => ['company_id' => $companyId]]
        ]
    ]
]

Which maps to something like this in SQL:

SELECT * from <index> 
WHERE (
    name_first LIKE '<query>%' OR
    name_last LIKE '<query>%' OR
    phone LIKE  '<query>%' OR
    email LIKE '<query>%'
)
AND state = 'active'
AND company_id = <query>

The key in all this is the minimum_should_match setting. Without this the filter totally overrides the should.

Hope this helps someone!


If you were using Solr's default or Lucene query parser, you can pretty much always put it into a query string query:

POST test/_search
{
  "query": {
    "query_string": {
      "query": "(( name:(+foo +bar) OR info:(+foo +bar)  )) AND state:(1) AND (has_image:(0) OR has_image:(1)^100)"
    }
  }
}

That said, you may want to use a boolean query, like the one you already posted, or even a combination of the two.