I have the following query. I'm trying to find values of 'hello world', but it returns zero results. However, when value = 'hello*', it does give me the expected result. Any idea how I can change my query so it returns the 'hello world' result? I've tried *hello world*, but for some reason it just won't match anything with spaces.

I think it has something to do with the spaces, because when I try to search "* *" it gives me no results, yet I know I have many values in there with spaces. Any ideas would help!
{
  "query": {
    "filtered": {
      "filter": {
        "and": [
          {
            "terms": {
              "variant": [
                "collection"
              ]
            }
          }
        ]
      },
      "query": {
        "wildcard": {
          "name": {
            "value": "hello world"
          }
        }
      }
    }
  }
}
Wildcard characters: a wildcard operator is a placeholder that stands in for unknown characters in a text value, handy for locating multiple items with similar but not identical data, or for matching on a specified pattern. Elasticsearch and Kibana support two wildcard expressions: the asterisk (*), which matches any character sequence, including the empty one, and the question mark (?), which matches a single character. Keep in mind that the pattern is applied to individual terms, so on an analyzed field * effectively matches zero or more non-space characters.
What is the mapping you have used for your field name? If you have not defined any mapping, or you have just defined the type as string (without any analyzer), then the field will be analyzed with the standard analyzer. This creates the tokens "hello" and "world" separately, so a wildcard query will work for something like *ell* or *wor*, but not for a pattern containing spaces.
You have to change your mapping so that the field "name" is not_analyzed; then wildcard searches with spaces will work.
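For example, once name is not_analyzed (or a keyword field), a pattern containing a space is matched against the whole stored value. A minimal sketch, reusing the field from the question:

{
  "query": {
    "wildcard": {
      "name": {
        "value": "*hello world*"
      }
    }
  }
}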
A word of caution: wildcard searches are heavy. If you want to do a partial-match search (the equivalent of %like%), you can use an ngram token filter in your analyzer and do a plain term search. It will take care of matching partial strings and will perform better too.
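A rough sketch of that ngram approach; the index name my_index, the analyzer and filter names, and the gram sizes are made up here, and the exact mapping syntax depends on your Elasticsearch version:

PUT my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "partial_filter": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 4
        }
      },
      "analyzer": {
        "partial_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "partial_filter"]
        }
      }
    }
  },
  "mappings": {
    "your_type": {
      "properties": {
        "name": {
          "type": "string",
          "analyzer": "partial_analyzer"
        }
      }
    }
  }
}

A simple match query on name (e.g. for "ello wor") will then hit the indexed grams; in practice you usually also set a plain search_analyzer so the query text itself is not broken into ngrams.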
You need to use match_phrase: {"field_name": "some phrase with spaces"}
As mentioned in the official docs: "To perform a phrase search rather than matching individual terms, you use match_phrase instead of match."
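In query DSL form that looks roughly like this, reusing the name field from the question (swap in your own field name):

{
  "query": {
    "match_phrase": {
      "name": "hello world"
    }
  }
}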
The "string" type is legacy and with index "not_analyzed" it is mapped to the type "keyword" which is not divided into substrings. I had problems with queries including spaces before though and solved it by splitting the query in substrings at the blank spaces and making a combined query, adding a wildcard-object for every substring, using "bool" and "must":
{
  "query": {
    "bool": {
      "must": [
        {
          "wildcard": {
            "name": "*hello*"
          }
        },
        {
          "wildcard": {
            "name": "*world*"
          }
        }
      ]
    }
  }
}
This method has the small drawback that "hell world!" and other unexpected strings end up in your results. You could solve that by changing "wildcard" to "match" for all but the last substring.
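A sketch of that variant, assuming the name field is analyzed so the individual terms actually exist as tokens (on a pure keyword field the match clause would have to equal the full stored value and would stop matching):

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "hello"
          }
        },
        {
          "wildcard": {
            "name": "*world*"
          }
        }
      ]
    }
  }
}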
You should try to solve it by first changing the type of the field:
PUT your_index
{
  "mappings": {
    "your_index": {
      "properties": {
        "your_field1": {
          "type": "keyword"
        },
        "your_field2": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}