
Exact matches with ElasticSearch (at query time)

I have a locations index, which has lots of location names and their respective countries.

I then want to know whether we have locations with title "Berlin" in the country with country code "DE".

Here's my Java code attempt:

SearchResponse response = client.prepareSearch("locations")
                .setQuery(QueryBuilders.matchQuery("title", "Berlin"))
                .setFilter(FilterBuilders.termFilter("country", "DE"))
                .execute()
                .actionGet();

But this returns too many hits, e.g. results for "Zoo Berlin" and so on. I need exact matches.

(But please note that I have other scenarios where this substring/text search matching is desired.)

Is there a way to decide at query time, rather than at index time, which behaviour (exact vs. analyzed text) one wants?

Michael Junk asked Aug 23 '13 12:08



2 Answers

Index the field you perform a term filter on as not_analyzed. For example, you can index the "country" field as a multi_field, with one of the sub-fields not_analyzed:

        "country": {
            "type": "multi_field",
            "fields": {
                "country": {"type": "string", "index": "analyzed"},
                "exact": {"type": "string","index": "not_analyzed"}
            }
        }

Additionally, you could do the same with the "title" field in order to perform a term query:

        "title": {
            "type": "multi_field",
            "fields": {
                "title": {"type": "string", "index": "analyzed"},
                "exact": {"type": "string","index": "not_analyzed"}
            }
        }

Then at query time, if you want a title with the exact term "Berlin" filtered by the exact term "DE", use a term query and term filter with the not_analyzed fields:

SearchResponse response = client.prepareSearch("locations")
                .setQuery(QueryBuilders.termQuery("title.exact", "Berlin"))
                .setFilter(FilterBuilders.termFilter("country.exact", "DE"))
                .execute()
                .actionGet();

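Note that term filters and term queries only return exact matches on not_analyzed fields. They do not analyze the search text, so on an analyzed field they compare it against the individual tokens produced at index time (with the default analyzer, lowercased words such as "berlin") rather than against the whole original value.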
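To illustrate why the term query needs the not_analyzed sub-field, here is a small self-contained sketch in plain Java. It does not call Elasticsearch at all: analyze is a hypothetical stand-in that only roughly mimics the default analyzer (split on non-alphanumeric characters, lowercase), and the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class TermMatchSketch {

    // Hypothetical stand-in for an analyzer (an assumption, not Elasticsearch code):
    // split on non-letter/non-digit characters and lowercase each token.
    static List<String> analyze(String value) {
        List<String> tokens = new ArrayList<>();
        for (String part : value.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // A not_analyzed field indexes the whole value as one unchanged term.
    static List<String> notAnalyzed(String value) {
        return Arrays.asList(value);
    }

    // A term query compares the query text, unmodified, with the indexed terms.
    static boolean termMatches(List<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm);
    }

    // A match query analyzes the query text first and matches if any token is shared.
    static boolean matchMatches(List<String> indexedTerms, String queryText) {
        for (String token : analyze(queryText)) {
            if (indexedTerms.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] titles = {"Berlin", "Zoo Berlin"};
        for (String title : titles) {
            System.out.println(title
                + " | match on analyzed: " + matchMatches(analyze(title), "Berlin")
                + " | term on analyzed: " + termMatches(analyze(title), "Berlin")
                + " | term on not_analyzed: " + termMatches(notAnalyzed(title), "Berlin"));
        }
    }
}
```

In this toy model, the match query hits both titles, the term query on the analyzed tokens hits neither (the indexed token is "berlin", not "Berlin"), and only the term query on the not_analyzed value separates "Berlin" from "Zoo Berlin".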

Scott Rice answered Sep 28 '22 05:09



From version 5 onward, Elasticsearch no longer has an analyzed / not_analyzed index setting for strings; the behaviour is driven by the field type.

The string data type is deprecated and replaced by text and keyword. A text field behaves like the old analyzed string: it is analyzed and tokenized.

A keyword field is not analyzed, so a term query against it only returns full, exact matches.

So remember to map the field as keyword when you want exact matching.

You can then use the same term query and term filter as explained by @Scott Rice.

The example below creates an index with such a mapping. Note that the APPLICATION and type fields each get two sub-fields: token, of type text, for tokenized search, and exact, of type keyword, for exact matching. It is sometimes useful to keep both for certain fields:

PUT testindex
{
  "mappings": {
    "original": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "@version": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "APPLICATION": {
          "type": "text",
          "fields": {
            "token": {"type": "text"},
            "exact": {"type": "keyword"}
          }
        },
        "type": {
          "type": "text",
          "fields": {
            "token": {"type": "text"},
            "exact": {"type": "keyword"}
          }
        }
      }
    }
  }
}
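
With a mapping like this, the exact lookup from the question could target the keyword sub-fields. The sketch below only builds the search request body as a string in plain Java, so it runs without the Elasticsearch client. The field names title.exact and country.exact are assumptions (they presume title and country were mapped the same way as APPLICATION), and the exactQueryBody helper is hypothetical, not part of any library.

```java
public class ExactQueryBodySketch {

    // Minimal escaping: only backslashes and double quotes are handled,
    // which is enough for simple values such as "Berlin" or "DE".
    static String jsonString(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Build a bool query: a term query on one keyword field,
    // plus a non-scoring term filter on another keyword field.
    static String exactQueryBody(String queryField, String queryValue,
                                 String filterField, String filterValue) {
        return "{\"query\":{\"bool\":{"
            + "\"must\":{\"term\":{" + jsonString(queryField) + ":" + jsonString(queryValue) + "}},"
            + "\"filter\":{\"term\":{" + jsonString(filterField) + ":" + jsonString(filterValue) + "}}"
            + "}}}";
    }

    public static void main(String[] args) {
        // Assumed field names; adjust them to the actual mapping.
        System.out.println(exactQueryBody("title.exact", "Berlin", "country.exact", "DE"));
    }
}
```

The resulting body is meant for a search request against the locations index. Choosing between exact and analyzed behaviour at query time then comes down to which field and query are used: a term query on title.exact for exact matches, or a match query on title for analyzed text search.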
Dean Jain answered Sep 28 '22 07:09