
Elasticsearch: Difference between "Term", "Match Phrase", and "Query String"



A term query matches a single term exactly as given: the value is not analyzed. In particular it is not lowercased, so whether it matches depends on how the field was indexed.

If you provided Bennett at index time and the field is not analyzed, the following query won't return anything:

{
  "query": {
    "term" : { "user" : "bennett" }
  }
}
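
To make the case sensitivity concrete, here is a small Python toy model. It is not Elasticsearch code: standard_analyze is only a rough stand-in for the standard analyzer, and index_terms and term_query are hypothetical helpers that imitate a lookup in an inverted index.

```python
import re

def standard_analyze(text):
    # Rough stand-in for the standard analyzer (assumption):
    # split on non-word characters and lowercase each token.
    return [t.lower() for t in re.findall(r"\w+", text)]

def index_terms(value, analyzed):
    # A non-analyzed field stores the whole value verbatim as one term;
    # an analyzed field stores the tokens produced by the analyzer.
    return set(standard_analyze(value)) if analyzed else {value}

def term_query(terms, value):
    # A term query looks the value up exactly as given, with no analysis.
    return value in terms

not_analyzed = index_terms("Bennett", analyzed=False)
analyzed = index_terms("Bennett", analyzed=True)

print(term_query(not_analyzed, "bennett"))  # False: stored term is "Bennett"
print(term_query(not_analyzed, "Bennett"))  # True
print(term_query(analyzed, "bennett"))      # True: stored term is "bennett"
print(term_query(analyzed, "Bennett"))      # False
```

The last line shows the opposite trap: on an analyzed field the capitalized value is the one that fails.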

A match_phrase query analyzes the input (if an analyzer is defined for the queried field) and finds documents that meet all of the following criteria:

  • all the terms must appear in the field
  • they must have the same order as the input value
  • they must be consecutive, with no intervening terms (potentially excluding stop-words, but this can be complicated)

For example, if you index the following documents (using standard analyzer for the field foo):

{ "foo":"I just said hello world" }

{ "foo":"Hello world" }

{ "foo":"World Hello" }

{ "foo":"Hello dear world" }

This match_phrase query will only return the first and second documents :

{
  "query": {
    "match_phrase": {
      "foo": "Hello World"
    }
  }
}
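
The three criteria can be checked by hand with a short Python sketch. This is a toy model, not how Elasticsearch implements phrase queries: analyze is a rough approximation of the standard analyzer, and match_phrase is a hypothetical helper that looks for the analyzed query tokens as a consecutive run in the analyzed document.

```python
import re

def analyze(text):
    # Rough approximation of the standard analyzer (assumption):
    # split on non-word characters and lowercase.
    return [t.lower() for t in re.findall(r"\w+", text)]

def match_phrase(field_text, phrase):
    # True if the analyzed phrase occurs as a consecutive,
    # in-order run of tokens in the analyzed field text.
    doc = analyze(field_text)
    query = analyze(phrase)
    n = len(query)
    return n > 0 and any(doc[i:i + n] == query for i in range(len(doc) - n + 1))

docs = [
    "I just said hello world",
    "Hello world",
    "World Hello",
    "Hello dear world",
]

hits = [d for d in docs if match_phrase(d, "Hello World")]
print(hits)  # ['I just said hello world', 'Hello world']
```

"World Hello" fails the ordering criterion and "Hello dear world" fails the consecutiveness criterion.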

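A query_string query searches, by default, on the _all field, which contains the text of several text fields at once. (The _all field was deprecated in Elasticsearch 6.0 and removed in 7.0; in newer versions the default comes from the index.query.default_field setting.) On top of that, the query is parsed and supports operators (AND, OR, ...), wildcards and so on (see related syntax).

As with match_phrase queries, the input is analyzed according to the analyzer set on the queried field.

Unlike match_phrase, the terms obtained after analysis don't have to be in the same order, unless the user has put quotes around the input. With the default OR operator, a document matches if it contains any one of the terms.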

For example, using the same documents as before, this query will return all the documents :

{
  "query": {
    "query_string": {
      "query": "hello World"
    }
  }
}

But this query will return the same 2 documents as the match_phrase query :

{
  "query": {
    "query_string": {
      "query": "\"Hello World\""
    }
  }
}
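
The difference between the quoted and unquoted forms can be imitated with a simplified Python model. It is only a sketch under stated assumptions: the real query_string parser supports far more syntax, analyze only approximates the standard analyzer, and query_string here is a hypothetical helper that ORs together bare words and quoted phrases.

```python
import re

def analyze(text):
    # Rough approximation of the standard analyzer (assumption).
    return [t.lower() for t in re.findall(r"\w+", text)]

def contains_run(doc_tokens, query_tokens):
    # True if query_tokens occur consecutively and in order in doc_tokens.
    n = len(query_tokens)
    return any(doc_tokens[i:i + n] == query_tokens
               for i in range(len(doc_tokens) - n + 1))

def query_string(field_text, query):
    # Split the query into quoted phrases and bare words, then OR the
    # clauses together (OR is assumed as the default operator).
    doc = analyze(field_text)
    for phrase, word in re.findall(r'"([^"]*)"|(\S+)', query):
        tokens = analyze(phrase or word)
        if tokens and contains_run(doc, tokens):
            return True
    return False

docs = [
    "I just said hello world",
    "Hello world",
    "World Hello",
    "Hello dear world",
]

unquoted = [d for d in docs if query_string(d, "hello World")]
quoted = [d for d in docs if query_string(d, '"Hello World"')]
print(len(unquoted))  # 4
print(quoted)         # ['I just said hello world', 'Hello world']
```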

There is much more to say about the different options for these queries. Please take a look at the related documentation:

  • term
  • match_phrase
  • query_string

Hope this is clear enough and it will help.


I think someone is probably looking for the differences between them with respect to partial search. Here is my analysis with the default standard analyzer.

Suppose we have this data:

{ "name": "Hello" }

What if we want to do a partial search with ell?

Term query or match query:

{"term": {"name": "*ell*"}}

Does not work; returns nothing.

{"term": {"name": "*zz* *ell*"}}

Does not work; returns nothing.

Conclusion: term and match cannot do a partial search at all.

Wildcard query:

{"wildcard": {"name": "*ell*"}}

Works; returns { "name": "Hello" }.

{"wildcard": {"name": "*zz* *ell*"}}

Does not work; returns nothing.

Conclusion: wildcard can do a partial search, but only with a single token.

query_string:

{"query_string": {"default_field": "name", "query": "*ell*"}}

Works; returns { "name": "Hello" }.

{"query_string": {"default_field": "name", "query": "*zz* *ell*"}}

Works; returns { "name": "Hello" }.

Conclusion: query_string can do a partial search even when two tokens are given, because the query is split on whitespace and, with the default OR operator, a match on any one pattern is enough.

Here the tokens are ell and zz.
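
The three conclusions can be reproduced with a toy Python model. This is a sketch, not Elasticsearch behavior in full: it assumes "Hello" was indexed as the single token "hello", uses Python's fnmatchcase as a stand-in for wildcard matching, and the three query functions are hypothetical helpers.

```python
from fnmatch import fnmatchcase

# "Hello" after the standard analyzer (assumption: one lowercased token).
indexed_tokens = ["hello"]

def term_query(tokens, value):
    # Exact lookup: wildcard characters have no special meaning here.
    return value in tokens

def wildcard_query(tokens, pattern):
    # The whole pattern, spaces included, must match a single token.
    return any(fnmatchcase(t, pattern) for t in tokens)

def query_string(tokens, query):
    # The query is split on whitespace and the parts are OR'd together
    # (OR is assumed as the default operator).
    return any(wildcard_query(tokens, part) for part in query.split())

print(term_query(indexed_tokens, "*ell*"))           # False
print(wildcard_query(indexed_tokens, "*ell*"))       # True
print(wildcard_query(indexed_tokens, "*zz* *ell*"))  # False
print(query_string(indexed_tokens, "*zz* *ell*"))    # True
```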