
How do I specify a different analyzer at query time with Elasticsearch?

I would like to use a different analyzer at query time to compose my query.

According to the "Controlling Analysis" section of the documentation, this is possible:

[...] the full sequence at search time:

  • The analyzer defined in the query itself, else
  • The search_analyzer defined in the field mapping, else
  • The analyzer defined in the field mapping, else
  • The analyzer named default_search in the index settings, which defaults to
  • The analyzer named default in the index settings, which defaults to
  • The standard analyzer
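
The fallback order quoted above can be sketched as a small Python function. This is only an illustration of the lookup order, not Elasticsearch's actual implementation; the function name `resolve_search_analyzer` and its parameters are hypothetical.

```python
from typing import Optional


def resolve_search_analyzer(
    query_analyzer: Optional[str] = None,         # "analyzer" given in the query itself
    field_search_analyzer: Optional[str] = None,  # "search_analyzer" in the field mapping
    field_analyzer: Optional[str] = None,         # "analyzer" in the field mapping
    index_default_search: Optional[str] = None,   # "default_search" in the index settings
    index_default: Optional[str] = None,          # "default" in the index settings
) -> str:
    """Return the first analyzer name that is set, following the quoted order."""
    candidates = (
        query_analyzer,
        field_search_analyzer,
        field_analyzer,
        index_default_search,
        index_default,
    )
    for name in candidates:
        if name:
            return name
    # Nothing was configured anywhere, so fall back to the standard analyzer.
    return "standard"
```

An analyzer named in the query clause always wins in this sketch, which is why setting it per clause lets one field be searched in several ways.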

But I don't know how to write the query so that different clauses use different analyzers:

"query"  => [
    "bool" => [
        "must"   => [
            {
                "match": ["my_field": "My query"]
                "<ANALYZER>": <ANALYZER_1>
            }
        ],
        "should" => [
            {
                "match": ["my_field": "My query"]
                "<ANALYZER>": <ANALYZER_2>    
            }
        ]
    ]
]

I know that I can index the same content into two or more fields, each with its own analyzer, but I have tight disk-space constraints and can't index the same information N times.

Thank you

Luca Mastrostefano asked Jun 10 '16 17:06


1 Answer

If you haven't already, first define the custom analyzers in your index's analysis settings.

Note: if the index already exists and is open, close it first, because new analyzers can only be added to a closed index.

POST /my_index/_close

Then add the custom analyzers through the settings endpoint.

PUT /my_index/_settings
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_analyzer1": { 
          "type": "standard",
          "stopwords_path": "stopwords/stopwords.txt"
        },
        "custom_analyzer2": { 
          "type": "standard",
          "stopwords": ["stop", "words"]
        }
      }
    }
  }
}

Open the index again.

POST /my_index/_open
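
Before querying, it can help to check how each analyzer tokenizes a sample string through the _analyze API. Here is a minimal Python sketch that only builds the request path and JSON body and makes no HTTP call; the helper name `analyze_request` is hypothetical, and the index and analyzer names are the ones used above.

```python
import json
from typing import Tuple


def analyze_request(index: str, analyzer: str, text: str) -> Tuple[str, str]:
    """Build the path and JSON body for an _analyze request on one index."""
    path = f"/{index}/_analyze"
    body = json.dumps({"analyzer": analyzer, "text": text})
    return path, body


# Send the result with any HTTP client, for example as GET or POST to the cluster URL.
path, body = analyze_request("my_index", "custom_analyzer2", "Stop words can be tough")
print(path)
print(body)
```

The response lists the tokens the analyzer produced, so you can confirm that the stopwords are actually dropped.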

Now you can query your index with the new analyzers by setting the analyzer parameter inside each match clause.

GET /my_index/_search
{
  "query": {
    "bool": {
      "should": [{
        "match": {
          "field_1": {
            "query": "Hello world",
            "analyzer": "custom_analyzer1"
          }
        }
      }],
      "must": [{
        "match": {
          "field_2": {
            "query": "Stop words can be tough",
            "analyzer": "custom_analyzer2"
          }
        }
      }]
    }
  }
}
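
The same pattern should also cover the case in the question, where both clauses target one field, since the analyzer is set per match clause and not per field. Here is a rough Python sketch that builds such a request body; the helper names `match_with_analyzer` and `build_query` are hypothetical, and `my_field` and "My query" come from the question.

```python
import json
from typing import Any, Dict


def match_with_analyzer(field: str, text: str, analyzer: str) -> Dict[str, Any]:
    """Build one match clause that overrides the search-time analyzer."""
    return {"match": {field: {"query": text, "analyzer": analyzer}}}


def build_query(field: str, text: str, must_analyzer: str, should_analyzer: str) -> Dict[str, Any]:
    """Build a bool query whose must and should clauses analyze the same text differently."""
    return {
        "query": {
            "bool": {
                "must": [match_with_analyzer(field, text, must_analyzer)],
                "should": [match_with_analyzer(field, text, should_analyzer)],
            }
        }
    }


body = build_query("my_field", "My query", "custom_analyzer1", "custom_analyzer2")
print(json.dumps(body, indent=2))
```

Only the query text is analyzed differently here. The field is still indexed once, so no extra disk space is needed.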

corin123 answered Sep 23 '22 01:09