
Exact (not substring) matching in Elasticsearch

{"query":{ "match" : { "content" : "2" } }} matches all the documents whose content contains the number 2; however, I would like the content to be exactly 2, no more, no less. Think of my requirement in the spirit of Java's String.equals.

Similarly, for the second query I would like a match only when the document's content is exactly '3 3' and nothing more or less: {"query":{ "match" : { "content" : "3 3" } }}

How could I do exact (String.equals) matching in Elasticsearch?

Dávid Natingga asked Aug 08 '13 23:08


2 Answers

Without seeing your index type mapping and sample data, it's hard to answer this directly - but I'll try.

Offhand, I'd say this is similar to this answer here (https://stackoverflow.com/a/12867852/382774), where you simply set the content field's index option to not_analyzed in your mapping:

"url" : {
    "type" : "string", 
    "index" : "not_analyzed"
}

Edit: I wasn't clear enough with my original answer, shown above. I did not mean to imply that you should add the example code to your query. I meant that you need to specify in your index type mapping that the field (url in the example, content in your case) is of type string and that it is indexed but not analyzed (not_analyzed).

This tells Elasticsearch to not bother analyzing (tokenizing or token filtering) the field when you're indexing your documents - just store it in the index as it exists in the document. For more information on mappings, see http://www.elasticsearch.org/guide/reference/mapping/ for an intro and http://www.elasticsearch.org/guide/reference/mapping/core-types/ for specifics on not_analyzed (tip: search for it on that page).
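To make the difference concrete, here is a toy model in Python of the two indexing modes. It is not Elasticsearch code: `analyze`, `index_field`, and `term_lookup` are hypothetical helpers, and the lowercase-and-split analyzer is only a rough stand-in for what a real analyzer does.

```python
def analyze(text):
    """Crude stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def index_field(value, analyzed):
    """Return the set of terms that would be stored for one field value."""
    if analyzed:
        return set(analyze(value))  # one term per token
    return {value}                  # the whole value kept as a single term


def term_lookup(docs, term, analyzed):
    """Return the documents whose indexed terms contain `term` exactly."""
    return [d for d in docs if term in index_field(d, analyzed)]


docs = ["2", "2 3", "3 3"]

analyzed_hits = term_lookup(docs, "2", analyzed=True)
exact_hits = term_lookup(docs, "2", analyzed=False)
phrase_hits = term_lookup(docs, "3 3", analyzed=False)
print(analyzed_hits, exact_hits, phrase_hits)
```

With analysis on, a lookup for 2 also hits the document "2 3", because that document was split into the terms 2 and 3. With analysis off, each document is a single term, so only a value that equals the whole field matches.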

Update:

The official documentation says that newer versions of Elasticsearch no longer let you map a field as "not_analyzed"; use the "keyword" type instead.

For older versions of Elasticsearch:

{
  "foo": {
    "type": "string",
    "index": "not_analyzed"
  }
}

For newer versions:

{
  "foo": {
    "type": "keyword",
    "index": true
  }
}

Note that the keyword type was introduced in Elasticsearch 5.0, and the backward compatibility layer for the old string mapping was removed in the 6.0 release.
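If the same code has to target both old and new clusters, the choice between the two mappings can be wrapped in a small helper. This is only a sketch: `exact_match_mapping` and its `es_major_version` parameter are hypothetical names, the version cutoff is taken from the note above, and "index": true is left out of the keyword mapping on the assumption that it is the default.

```python
import json


def exact_match_mapping(field, es_major_version):
    """Build a mapping fragment that leaves `field` unanalyzed.

    The cutoff at 5 follows the note above (keyword type from 5.0 on);
    the function itself is illustrative, not part of any client library.
    """
    if es_major_version >= 5:
        return {field: {"type": "keyword"}}
    return {field: {"type": "string", "index": "not_analyzed"}}


old_mapping = exact_match_mapping("content", 2)
new_mapping = exact_match_mapping("content", 6)
print(json.dumps(old_mapping, indent=2))
print(json.dumps(new_mapping, indent=2))
```

The returned dict is the per-field fragment only; it still has to be placed under the properties of your own index mapping.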

James Addison answered Oct 11 '22 16:10


Official Doc

You should use a term filter instead of a match query.

{
    "query" : {
        "constant_score" : {
            "filter" : {
                "term" : {
                    "content" : 2
                }
            }
        }
    }
}

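This returns documents whose content is exactly 2, not 20 or 2.1. For a string field, it only behaves like an exact match if the field is not analyzed, because a term filter compares the value against the indexed terms without analyzing it first.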
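The same request body can be built for the '3 3' case from the question. This is a minimal sketch: `exact_term_query` is a hypothetical helper, it assumes content is mapped as an unanalyzed field, and sending the body to a cluster is left out.

```python
import json


def exact_term_query(field, value):
    """Wrap a term filter in constant_score, as in the request above."""
    return {
        "query": {
            "constant_score": {
                "filter": {
                    "term": {field: value}
                }
            }
        }
    }


body = exact_term_query("content", "3 3")
print(json.dumps(body, indent=2))
```

Because the term filter is not analyzed, the value "3 3" is looked up as one term rather than being split into two 3 tokens.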

YLS answered Oct 11 '22 18:10