Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Elasticsearch difference between MUST and SHOULD bool query

People also ask

When should I use bool query Elasticsearch?

Boolean queries in Elasticsearch are a popular query type because of their versatility and ease of use. Boolean queries, or bool queries, find or match documents by using boolean clauses. For the vast majority of cases, the filtering clause will be used because it can be cached for faster search times.

What is Elasticsearch bool query?

Boolean, or a bool query in Elasticsearch, is a type of search that allows you to combine conditions using Boolean conditions. Elasticsearch will search the document in the specified index and return all the records matching the combination of Boolean clauses.

What is minimum should match Elasticsearch?

Minimum Should Match is another search technique that allows you to conduct a more controlled search on related or co-occurring topics by specifying the number of search terms or phrases in the query that should occur within the records returned.

What is the difference between term and match query in Elasticsearch?

To better search text fields, the match query also analyzes your provided search term before performing a search. This means the match query can search text fields for analyzed tokens rather than an exact term. The term query does not analyze the search term. The term query only searches for the exact term you provide.


must means: The clause (query) must appear in matching documents. These clauses must match, like logical AND.

should means: At least one of these clauses must match, like logical OR.

Basically they are used like logical operators AND and OR. See this.

Now in a bool query:

must means: Clauses that must match for the document to be included.

should means: If these clauses match, they increase the _score; otherwise, they have no effect. They are simply used to refine the relevance score for each document.


Yes you can use multiple filters inside must.


Since this is a popular question, I would like to add that in Elasticsearch version 2 things changed a bit.

Instead of filtered query, one should use bool query in the top level.

If you don't care about the score of must parts, then put those parts into filter key. No scoring means faster search. Also, Elasticsearch will automatically figure out, whether to cache them, etc. must_not is equally valid for caching.

Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html

Also, mind that "gte": "now" cannot be cached, because of millisecond granularity. Use two ranges in a must clause: one with now/1h and another with now so that the first can be cached for a while and the second for precise filtering accelerated on a smaller result set.


As said in the documentation:

Must: The clause (query) must appear in matching documents.

Should: The clause (query) should appear in the matching document. In a boolean query with no must clauses, one or more should clauses must match a document. The minimum number of should clauses to match can be set using the minimum_should_match parameter.

In other words, results will have to be matched by all the queries present in the must clause ( or match at least one of the should clauses if there is no must clause.

Since you want your results to satisfy all the queries, you should use must.


You can indeed use filters inside a boolean query.