
Difference between must_not and filter in elasticsearch

Can someone explain the difference between must_not and filter in Elasticsearch?

For example, in the query below (taken from Elasticsearch: The Definitive Guide), why isn't must_not also used for the range?

{
    "bool": {
        "must":     { "match": { "title": "how to make millions" }},
        "must_not": { "match": { "tag":   "spam" }},
        "should": [
            { "match": { "tag": "starred" }}
        ],
        "filter": {
          "range": { "date": { "gte": "2014-01-01" }} 
        }
    }
}

Specifically looking at this documentation, it appears to me that they are exactly the same:

filter: The clause (query) must appear in matching documents. However unlike must the score of the query will be ignored. Filter clauses are executed in filter context, meaning that scoring is ignored and clauses are considered for caching.

must_not: The clause (query) must not appear in the matching documents. Clauses are executed in filter context meaning that scoring is ignored and clauses are considered for caching. Because scoring is ignored, a score of 0 for all documents is returned.

schneida asked Nov 10 '17 15:11



2 Answers

Use filter when the documents matching the clause should be included in the results, and must_not when the documents matching the clause should be excluded from the results. In more detail:

filter:

  1. It runs in filter context.
  2. It does not affect the score of the results.
  3. Documents that match the clause appear in the results.
  4. It gives a yes/no match with no relevance ranking. It is typically used for exact-value conditions such as term or range, although any query can be placed in it.

must_not:

  1. It also runs in filter context.
  2. So it does not affect the score of the results either.
  3. Documents that match the clause do NOT appear in the results.
  4. It likewise gives a yes/no match with no relevance ranking.
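To make the two lists concrete, here is a minimal sketch in plain Python. It is a toy in-memory model of how the clauses select documents, not the Elasticsearch client or its real execution path. The sample documents and the helper names (`is_spam`, `select`) are made up for illustration.

```python
# Toy model of bool-query selection; hypothetical data and helpers,
# not the Elasticsearch API.
docs = [
    {"id": 1, "tag": "spam"},
    {"id": 2, "tag": "starred"},
    {"id": 3, "tag": "news"},
]

def is_spam(doc):
    """Hypothetical clause: the document's tag equals 'spam'."""
    return doc.get("tag") == "spam"

def select(docs, filter_clauses=(), must_not_clauses=()):
    """Keep docs that satisfy every filter clause and no must_not clause."""
    return [
        d for d in docs
        if all(clause(d) for clause in filter_clauses)
        and not any(clause(d) for clause in must_not_clauses)
    ]

# The same clause placed in the two different slots.
as_filter = [d["id"] for d in select(docs, filter_clauses=[is_spam])]
as_must_not = [d["id"] for d in select(docs, must_not_clauses=[is_spam])]

print(as_filter)    # [1]
print(as_must_not)  # [2, 3]
```

The same clause gives complementary result sets depending on which slot it sits in. In both cases no score is computed from it.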

[Image: tabular comparison]

Soumendra answered Oct 10 '22 00:10


Basically, filter = must but without scoring.

must_not expresses a condition that MUST NOT be met, while filter (and must) express conditions that MUST be met in order for a document to be selected.
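Applied to the range in the question: the date condition is a positive one ("on or after 2014-01-01"), so it belongs in filter. Moving it to must_not would mean inverting the bound, and the two forms are not quite the same thing. A rough sketch in plain Python follows, again as a toy in-memory model rather than the Elasticsearch client. The documents and helper names are assumptions for illustration.

```python
# Toy model; hypothetical data and helpers, not the Elasticsearch API.
docs = [
    {"id": 1, "date": "2014-06-01"},
    {"id": 2, "date": "2013-05-01"},
    {"id": 3},  # document with no date field at all
]

def date_gte(bound):
    # ISO-8601 date strings sort lexicographically, so comparing strings works here.
    return lambda d: "date" in d and d["date"] >= bound

def date_lt(bound):
    return lambda d: "date" in d and d["date"] < bound

def select(docs, filter_clauses=(), must_not_clauses=()):
    """Keep docs that satisfy every filter clause and no must_not clause."""
    return [
        d for d in docs
        if all(clause(d) for clause in filter_clauses)
        and not any(clause(d) for clause in must_not_clauses)
    ]

# filter: the range must match.
kept_by_filter = [d["id"] for d in select(docs, filter_clauses=[date_gte("2014-01-01")])]

# must_not with the inverted range: anything NOT before 2014 is kept.
kept_by_must_not = [d["id"] for d in select(docs, must_not_clauses=[date_lt("2014-01-01")])]

print(kept_by_filter)    # [1]
print(kept_by_must_not)  # [1, 3]
```

The document without a date passes the must_not version because the inverted range does not match it, so nothing excludes it. The filter version requires the range to match, so that document is dropped. This is one practical reason to state a positive condition with filter rather than negating its opposite.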

Val answered Oct 10 '22 01:10