
Elastic Search wildcard search with spaces

I have the following query. I'm trying to find values containing 'hello world', but it returns zero results. However, when value = 'hello*', it does give me the expected result. Any idea how I can change my query to return the 'hello world' result? I've tried '*hello world*', but for some reason it won't match anything with spaces.

I think it has something to do with the spaces, because when I search for "* *" I get no results, even though I know many of my values contain spaces. Any ideas would help!

{
  "query": {
    "filtered": {
      "filter": {
        "and": [
          {
            "terms": {
              "variant": [
                "collection"
              ]
            }
          }
        ]
      },
      "query": {
        "wildcard": {
          "name": {
            "value": "hello world"
          }
        }
      }
    }
  }
}
user1530318 asked May 08 '15 00:05



3 Answers

What mapping have you used for the field "name"? If you have not defined any mapping, or have only defined the type as string (without an analyzer), the field is analyzed with the standard analyzer. That produces the separate tokens "hello" and "world". A wildcard query is matched against individual tokens, so it works for something like *ell* or *wor* but not for a pattern containing a space, since no token contains one.

You have to change your mapping so that the field "name" is not_analyzed. The whole value is then indexed as a single term, and wildcard searches with spaces will work.
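
To make the difference concrete, here is a small Python sketch that imitates the behaviour rather than calling Elasticsearch. It assumes the standard analyzer can be approximated by lowercasing and splitting on non-alphanumeric characters, and it uses Python's fnmatch as a stand-in for wildcard matching. The function names are made up for illustration.

```python
import re
from fnmatch import fnmatchcase


def analyzed_terms(text):
    # Rough approximation of the standard analyzer (an assumption, not the
    # real implementation): lowercase, then split on non-alphanumerics.
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def not_analyzed_terms(text):
    # A not_analyzed field indexes the whole value as one single term.
    return [text]


def wildcard_matches(pattern, terms):
    # A wildcard query is term-level: the pattern has to match one whole
    # indexed term. fnmatch also understands [..] sets, which is ignored here.
    return any(fnmatchcase(term, pattern) for term in terms)


value = "hello world"
print(analyzed_terms(value))                                         # ['hello', 'world']
print(wildcard_matches("hello*", analyzed_terms(value)))             # True
print(wildcard_matches("*hello world*", analyzed_terms(value)))      # False
print(wildcard_matches("* *", analyzed_terms(value)))                # False
print(wildcard_matches("*hello world*", not_analyzed_terms(value)))  # True
```

This also reproduces the "* *" case from the question: no analyzed token contains a space, so that pattern can never match.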

A word of caution: wildcard searches are expensive, especially with a leading wildcard. If you want partial matching (the equivalent of SQL's LIKE '%...%'), you can use an ngram token filter in your analyzer and run a term query instead. That takes care of matching partial strings and performs better too.
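
The idea behind the ngram approach can be sketched in plain Python. This is only an illustration of the principle, assuming a fixed gram size of 3 and whitespace tokenization. The real filter is configured in the index settings (with min_gram and max_gram), and the helper names below are hypothetical.

```python
def ngrams(token, n=3):
    # All contiguous substrings of length n; tokens shorter than n yield none.
    return {token[i:i + n] for i in range(len(token) - n + 1)}


def index_terms(text, n=3):
    # Simplified indexing step: lowercase, split on whitespace, emit n-grams.
    terms = set()
    for token in text.lower().split():
        terms |= ngrams(token, n)
    return terms


terms = index_terms("hello world")
print(sorted(terms))   # ['ell', 'hel', 'llo', 'orl', 'rld', 'wor']

# A partial match becomes an exact term lookup instead of a wildcard scan.
print("ell" in terms)  # True
print("wor" in terms)  # True
print("xyz" in terms)  # False
```

The cost moves from query time to index time: more terms are stored, but each search is a direct lookup.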

Prabin Meitei answered Oct 02 '22 12:10


You need to use

{"query": {"match_phrase": {"field_name": "some phrase with spaces"}}}

As mentioned in the official docs,

"To perform a phrase search rather than matching individual terms, you use match_phrase instead of match."

max answered Oct 02 '22 10:10


The "string" type is legacy. A "string" field with index "not_analyzed" corresponds to the newer type "keyword", where the value is indexed as a single term instead of being split into tokens. I still had problems with queries containing spaces, though, and solved them by splitting the query into substrings at the spaces and building a combined query: one wildcard clause per substring, joined with "bool" and "must":

{
  "query": {
    "bool": {
      "must": [
        {
          "wildcard": {
            "name": "*hello*"
          }
        },
        {
          "wildcard": {
            "name": "*world*"
          }
        }
      ]
    }
  }
}

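This method has the drawback that unexpected values end up in your result. The order of the words is not checked, so "world hello" matches too, and because every pattern has a wildcard on both sides, so does something like "othello underworld". On an analyzed field you could reduce that by changing "wildcard" to "match" for all but the last substring.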
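
If the query is built in application code, the splitting step might look like the following Python sketch. The function name build_wildcard_query is hypothetical, and the sketch does not escape wildcard characters that may already be present in the input.

```python
import json


def build_wildcard_query(field, phrase):
    # Split the phrase at whitespace and wrap every part in *...*.
    clauses = [{"wildcard": {field: f"*{part}*"}} for part in phrase.split()]
    # All clauses have to match, hence bool/must.
    return {"query": {"bool": {"must": clauses}}}


query = build_wildcard_query("name", "hello world")
print(json.dumps(query, indent=2))
```

For "hello world" this produces the same request body as the JSON shown above.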

You should first try to solve it by changing the type of the field, either to "keyword" or to the legacy "string" with "index": "not_analyzed":

PUT your_index
{
  "mappings": {
    "your_index": {
      "properties": {
        "your_field1": {
          "type": "keyword"
        },
        "your_field2": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
hashten answered Oct 02 '22 11:10