
Query vs. Filter in Elasticsearch


I am indexing a document with three fields, first_name, last_name and occupation, all of type "keyword", with the values XYZ, ABC and DEF respectively.

I wrote a query that uses filter for an exact match with an AND condition:

"query": {
  "bool": {
    "filter": [
      {"term": {"first_name": "XYZ"}},
      {"term": {"last_name": "ABC"}}
    ]
  }
}

This should return one document, but it returns nothing.

I have another query for the same operation:

"query": {
  "bool": {
    "must": [
      {"match": {"first_name": "XYZ"}},
      {"match": {"last_name": "ABC"}}
    ]
  }
}

This returns one document.

According to the Elasticsearch documentation, the difference between query and filter is that filter does not score the results. I am not sure why the first query returns nothing. Is my understanding correct?

asked Oct 29 '18 20:10 by archura




1 Answer

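As the documentation states, the only difference between query context and filter context is scoring. That holds only when both use the same query type, though. Here you are comparing two different types: term and match. term does an exact comparison against the indexed terms and does not analyze the value you pass in, while match analyzes its input and is meant for full-text search. If first_name and last_name are in fact mapped as text (as in the mapping below) rather than keyword, the standard analyzer indexes "XYZ" as the lowercased token xyz, so a term query for "XYZ" finds nothing while a match query for "XYZ" succeeds.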
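To make the term/match distinction concrete, here is a small Python sketch. It is a toy model and not Elasticsearch code: the function names are hypothetical, and standard_like_analyze only approximates the default analysis of a text field (lowercasing and splitting on non-alphanumeric characters).

```python
import re


def standard_like_analyze(text):
    """Toy stand-in for a text field's default analysis (assumption):
    lowercase the input and split it on non-alphanumeric characters."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def index_field(value, field_type):
    """Return the set of terms that would be stored for a field value."""
    if field_type == "text":
        return set(standard_like_analyze(value))
    return {value}  # keyword: the whole value is stored verbatim as one term


def term_query(indexed_terms, value):
    """term: the query value is not analyzed; it must equal an indexed term."""
    return value in indexed_terms


def match_query(indexed_terms, query_text, field_type):
    """match: the query text is analyzed like the field before the lookup."""
    if field_type == "text":
        query_terms = standard_like_analyze(query_text)
    else:
        query_terms = [query_text]
    return any(term in indexed_terms for term in query_terms)


first_name_terms = index_field("XYZ", "text")      # stored as {"xyz"}
occupation_terms = index_field("DEF", "keyword")   # stored as {"DEF"}

print(term_query(first_name_terms, "XYZ"))           # False
print(match_query(first_name_terms, "XYZ", "text"))  # True
print(term_query(first_name_terms, "xyz"))           # True
print(term_query(occupation_terms, "DEF"))           # True
```

In this model the term lookup for "XYZ" fails on the text field only because the stored token was lowercased, which mirrors the behaviour described in the question.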

Take a look at the example below.

Your mapping:

PUT /index_53053054
{
  "mappings": {
    "_doc": {
      "properties": {
        "first_name": {
          "type": "text"
        },
        "last_name": {
          "type": "text"
        },
        "occupation": {
          "type": "keyword"
        }
      }
    }
  }
}

Your document:

PUT index_53053054/_doc/1
{
  "first_name": "XYZ",
  "last_name": "ABC",
  "occupation": "DEF"
}

filter query:

GET index_53053054/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "match": {
            "first_name": "XYZ"
          }
        },
        {
          "match": {
            "last_name": "ABC"
          }
        },
        {
          "term": {
            "occupation": "DEF"
          }
        }
      ]
    }
  }
}

and result:

{
  "took": 7,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0,
    "hits": [
      {
        "_index": "index_53053054",
        "_type": "_doc",
        "_id": "1",
        "_score": 0,
        "_source": {
          "first_name": "XYZ",
          "last_name": "ABC",
          "occupation": "DEF"
        }
      }
    ]
  }
}

Similar must query:

GET index_53053054/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "first_name": "XYZ"
          }
        },
        {
          "match": {
            "last_name": "ABC"
          }
        },
        {
          "term": {
            "occupation": "DEF"
          }
        }
      ]
    }
  }
}

and response:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.8630463,
    "hits": [
      {
        "_index": "index_53053054",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.8630463,
        "_source": {
          "first_name": "XYZ",
          "last_name": "ABC",
          "occupation": "DEF"
        }
      }
    ]
  }
}

As you can see, the hits are the same apart from the score. In the filter version no score is calculated (_score is 0), while in the must version it is (_score is 0.8630463).
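The two requests above differ only in the key used inside bool. As a rough illustration, the Python sketch below builds both request bodies from the same list of clauses; bool_query is a hypothetical helper written for this example, not part of any Elasticsearch client.

```python
import json


def bool_query(clauses, scored):
    """Wrap the same leaf clauses in `must` (scored) or `filter` (unscored).

    Hypothetical helper: it only builds the JSON body, it sends nothing.
    """
    occurrence = "must" if scored else "filter"
    return {"query": {"bool": {occurrence: list(clauses)}}}


# The same three clauses used in the example requests.
clauses = [
    {"match": {"first_name": "XYZ"}},
    {"match": {"last_name": "ABC"}},
    {"term": {"occupation": "DEF"}},
]

filter_body = bool_query(clauses, scored=False)
must_body = bool_query(clauses, scored=True)

print(json.dumps(filter_body, indent=2))
print(json.dumps(must_body, indent=2))
```

Either body can be sent to the _search endpoint as shown in the requests above; only the scoring of the hits should change.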

Read more: https://www.elastic.co/guide/en/elasticsearch/reference/6.4/query-filter-context.html

answered Nov 09 '22 14:11 by Piotr Pradzynski