
How to run exact-value and match queries on the same field in Elasticsearch?

So I have a field that stores a value in the format: number/year, something like 23/2014, 24/2014, 12/2015, etc...

If this field is mapped as not_analyzed, I can run exact-value searches with a term filter. Searching for a value in that exact structure (for example 1/2014 or 15/2014) works like SQL equals (=).

{
  "query": {
    "filtered": {
      "filter": {
        "term": {
          "processNumber": "11/2014"
        }
      }
    }
  }
}

So, searching with something different like 11/, or /2014 wouldn't return hits. This is fine.

But if I define the field as not_analyzed, I can't run SQL LIKE-style searches with the match_phrase query.

{
  "query": {
    "match_phrase": {
      "processNumber": "11/201"
    }
  }
}

In this case, searching for 11, 11/, /2014 or 2014 should return hits, but it doesn't. The query does work if the field is not mapped as not_analyzed. So it seems I have to pick one or the other, but the field needs to support both kinds of query. Am I missing something here?

Maxrunner asked Nov 13 '14 16:11


1 Answer

You can analyze the same field, processNumber, in different ways by using the fields property in the mapping.

For example, if you want both an analyzed and a not_analyzed version of processNumber, the mapping would be:

{
   "type_name": {
      "properties": {
         "processNumber": {
            "type": "string",
            "index": "not_analyzed",
            "fields": {
               "analyzed": {
                  "type": "string",
                  "index": "analyzed"
               }
            }
         }
      }
   }
}

Here the not_analyzed field is referred to in queries as processNumber.

To refer to the analyzed view of the field, use processNumber.analyzed.
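As a rough illustration of what the two views hold, here is a minimal Python sketch. It does not talk to Elasticsearch; the function name indexed_terms is hypothetical, and the regular expression is only an approximation of the standard analyzer (the real one follows Unicode word-break rules and would, for instance, keep 11.2014 together).

```python
import re

def indexed_terms(value):
    """Approximate the terms stored for each view of processNumber."""
    return {
        # not_analyzed: the whole value is kept as a single term
        "processNumber": [value],
        # analyzed: rough stand-in for the standard analyzer,
        # splitting on anything that is not an ASCII letter or digit
        "processNumber.analyzed": [
            t.lower() for t in re.findall(r"[A-Za-z0-9]+", value)
        ],
    }

for field, terms in indexed_terms("11/2014").items():
    print(field, terms)
```

Under these assumptions, processNumber holds the single term 11/2014, while processNumber.analyzed holds 11 and 2014 as separate terms.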

The queries for terms such as 11/201 and 11 would be:

Example Filter:

 { "query" : { "filtered" : { "filter" : { "term" : { "processNumber" : "11/2014" } } } } }

The term filter does not analyze the search string, so the input is matched as-is against the field's inverted index; in this case, 11/2014 is compared against the terms stored for the field.

Example match_phrase_prefix:

{ "query": { "match_phrase_prefix": { "processNumber": "11/201" } } }

match_phrase_prefix checks whether the last term in the phrase is a prefix of terms in the index. It analyzes the search string if an analyzer applies to the field. That is why you need the not_analyzed version of the field here: if we used processNumber.analyzed, search strings such as 11-201 or 11|201 would also match.
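To make the prefix behaviour concrete, here is a small Python sketch of what happens on the not_analyzed field, assuming the query string stays a single term and is compared as a prefix against the stored terms. The function prefix_hits and the sample values are hypothetical, and this is a simplified model rather than how Elasticsearch implements the query.

```python
def prefix_hits(stored_terms, query):
    """Return stored terms that start with the unanalyzed query string."""
    return [term for term in stored_terms if term.startswith(query)]

# Hypothetical sample values for processNumber
stored = ["11/2014", "11/2015", "12/2014", "111/2014"]

print(prefix_hits(stored, "11/201"))  # terms beginning with 11/201
print(prefix_hits(stored, "/2014"))   # nothing begins with /2014
```

In this model a fragment such as /2014 or 2014 finds nothing on the not_analyzed field, because no stored term begins with it; those searches would go against processNumber.analyzed instead.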

Example match:

  { "query": { "match": { "processNumber.analyzed": "11" } } }

This is a straightforward match, since the default analyzer (usually the standard analyzer) tokenizes 11/2014 into the terms 11 and 2014.
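The following Python sketch imitates that match locally, assuming a simple split on non-alphanumeric characters as a stand-in for the standard analyzer and the default OR behaviour of match. The names tokenize and match_hits and the sample values are hypothetical.

```python
import re

def tokenize(text):
    """Rough stand-in for the standard analyzer."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]

def match_hits(values, query):
    """Return values sharing at least one token with the query (OR semantics)."""
    wanted = set(tokenize(query))
    return [v for v in values if wanted & set(tokenize(v))]

# Hypothetical sample values for processNumber.analyzed
values = ["11/2014", "11/2015", "12/2014"]

print(match_hits(values, "11"))
print(match_hits(values, "2014"))
```

Searching for 11 or 2014 alone finds every value containing that token, which is the LIKE-style behaviour the question asks for.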

You can use the analyze API to see how a particular text gets analyzed by the default analyzer.

curl -XPOST "http://<machine>/_analyze?text=11/2014"
keety answered Sep 18 '22 13:09