Match string with minus character in elasticsearch

Tags:

elasticsearch

So in DB I have this entry:

Mark-Whalberg

When searching with term

Mark-Whalberg

I get no match.

Why? As I understand it, minus is a special character. Does it symbolize "exclude"?

The query is this:

{"query_string": {"query": 'Mark-Whalberg', "default_operator": "AND"}}

Searching everything else, like:

Mark
Whalberg
hlb
Mark Whalberg

returns a match.

Is this stored as two different pieces? How can I get a match when including the minus sign in the search term?

--------------EDIT--------------

This is the current query:

var fields = [
    "field1",
    "field2",
];

{"query_string":{"query": '*Mark-Whalberg*',"default_operator": "AND","fields": fields}};

asked May 18 '17 09:05

oderfla

2 Answers

You have an analyzer configuration issue.

Let me explain. When you defined your index in Elasticsearch, you didn't specify an analyzer for the field, so the Standard Analyzer is applied by default.

According to the documentation:

Standard Analyzer

The standard analyzer is the default analyzer which is used if none is specified. It provides grammar based tokenization (based on the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29) and works well for most languages.

Also, to answer your question:

Why? As I understand it, minus is a special character. Does it symbolize "exclude"?

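For the Standard Analyzer, yes, it is special, though not in the "exclude" sense: the analyzer treats the hyphen as punctuation, splits the text there, and drops it. So Mark-Whalberg is indexed as the two terms mark and whalberg. (In query_string syntax, a minus placed directly in front of a term does mean "must not be present", but that is not what happens to a hyphen inside a term.)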
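
One way to check this against your own cluster is the _analyze API, which returns the tokens an analyzer produces for a given text. The following is a minimal sketch, assuming Node.js 18+ (for the global fetch) and an unsecured node at http://localhost:9200; the function names are made up for illustration.

```javascript
// Build the request for the _analyze API.
// Helper names here are hypothetical, not part of any client library.
function buildAnalyzeRequest(text, analyzer = "standard") {
  return {
    method: "POST",
    path: "/_analyze",
    body: JSON.stringify({ analyzer, text }),
  };
}

// Send the request to a running node and return only the token strings.
// Assumes no authentication is required on the node.
async function analyze(baseUrl, text) {
  const request = buildAnalyzeRequest(text);
  const response = await fetch(baseUrl + request.path, {
    method: request.method,
    headers: { "Content-Type": "application/json" },
    body: request.body,
  });
  const result = await response.json();
  return result.tokens.map((entry) => entry.token);
}

// Print the request that would be sent; no cluster is needed for this line.
console.log(buildAnalyzeRequest("Mark-Whalberg"));
// With a node running:
// analyze("http://localhost:9200", "Mark-Whalberg").then(console.log);
```

If the field uses the Standard Analyzer, the response should list two separate tokens rather than one hyphenated term.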

From the documentation:

Why doesn’t the term query match my document?

[...] There are many ways to analyze text: the default standard analyzer drops most punctuation, breaks up text into individual words, and lower cases them. For instance, the standard analyzer would turn the string “Quick Brown Fox!” into the terms [quick, brown, fox]. [...]

Example:

If you have the following text:

"The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."

Then the Standard Analyzer will produce:

[ the, 2, quick, brown, foxes, jumped, over, the, lazy, dog's, bone ]
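
To make the effect on the value from the question concrete, here is a rough approximation of that behaviour in plain JavaScript. It is only a sketch: the real analyzer follows the Unicode Text Segmentation rules (UAX #29), while this regex simply keeps runs of letters and digits (allowing an inner apostrophe) and lowercases them. The function name is made up for illustration.

```javascript
// Rough stand-in for the standard analyzer, for illustration only.
// It does NOT implement UAX #29; it just splits on punctuation and lowercases.
function approximateStandardAnalyzer(text) {
  // Runs of letters/digits, optionally joined by an apostrophe (as in dog's).
  const tokens = text.match(/[\p{L}\p{N}]+(?:'[\p{L}\p{N}]+)*/gu) || [];
  return tokens.map((token) => token.toLowerCase());
}

console.log(approximateStandardAnalyzer("Mark-Whalberg"));
// prints [ 'mark', 'whalberg' ]
```

This is also a plausible reason why the wildcard query *Mark-Whalberg* finds nothing: wildcard patterns are compared against individual indexed terms, and neither mark nor whalberg contains a hyphen.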

If you want the hyphenated value to match, you have 2 solutions:

You can use a match query, which analyzes the search text with the same analyzer as the field.
You can ask Elasticsearch not to analyze the field when you create your index: here's how
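
As a sketch of what the two options could look like as request bodies: the field name below is hypothetical, and the mapping assumes a recent Elasticsearch version with typeless mappings and the keyword field type (older versions express "not analyzed" differently).

```javascript
// Hypothetical field name; replace it with the field you actually search.
const FIELD = "name";

// Option 1: a match query. The search text goes through the field's analyzer,
// so "Mark-Whalberg" becomes the terms mark and whalberg, both required here.
function buildMatchQuery(text) {
  return { query: { match: { [FIELD]: { query: text, operator: "AND" } } } };
}

// Option 2: an index body that stores the value as one un-analyzed term.
// Assumes typeless mappings and the keyword type.
function buildKeywordIndexBody() {
  return { mappings: { properties: { [FIELD]: { type: "keyword" } } } };
}

// Exact lookup against a keyword field: the text must equal the stored value.
function buildTermQuery(text) {
  return { query: { term: { [FIELD]: text } } };
}

console.log(JSON.stringify(buildMatchQuery("Mark-Whalberg"), null, 2));
console.log(JSON.stringify(buildKeywordIndexBody(), null, 2));
console.log(JSON.stringify(buildTermQuery("Mark-Whalberg"), null, 2));
```

The trade-off is that a keyword field only matches the exact stored value (including case), whereas the match query keeps word-level search on the analyzed field.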

I hope this will help you.

answered Sep 28 '22 09:09

Mickael

I was stuck on the same question, and the answer from @Mickael was perfect for understanding what is going on (I really recommend reading the linked documentation).

I solved this by setting an operator on the query:

GET http://localhost:9200/creative/_search

{  
  "query": {
    "match": {
      "keyword_id": {
        "query": "fake-keyword-uuid-3",
        "operator": "AND"
      }
    }
  }
}
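
To illustrate why the operator matters, here is a small simulation of how a match query combines the analyzed terms. It is a sketch under simplifying assumptions: the tokenizer is a crude stand-in for the real analyzer, scoring is ignored, and the function names and the second document value are made up for the example.

```javascript
// Crude tokenizer standing in for the standard analyzer (illustration only).
function tokenize(text) {
  return (text.match(/[\p{L}\p{N}]+/gu) || []).map((token) => token.toLowerCase());
}

// OR (the match query default): any query term found in the document is enough.
// AND: every query term must be found in the document.
function simulateMatch(documentText, queryText, operator = "OR") {
  const documentTerms = new Set(tokenize(documentText));
  const queryTerms = tokenize(queryText);
  if (operator.toUpperCase() === "AND") {
    return queryTerms.every((term) => documentTerms.has(term));
  }
  return queryTerms.some((term) => documentTerms.has(term));
}

// Hypothetical second document that shares most terms with the query.
console.log(simulateMatch("fake-keyword-uuid-7", "fake-keyword-uuid-3", "OR"));  // true
console.log(simulateMatch("fake-keyword-uuid-7", "fake-keyword-uuid-3", "AND")); // false
console.log(simulateMatch("fake-keyword-uuid-3", "fake-keyword-uuid-3", "AND")); // true
```

With the default OR, any document sharing fake, keyword, or uuid would be returned; AND narrows the hits to documents containing every term.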

To better understand how this query is evaluated, try adding "explain": true and analyse the results:

GET http://localhost:9200/creative/_search

{  
  "explain": true,
  "query": // ...
}

answered Sep 28 '22 07:09

Abe
