
Elasticsearch array property must contain given array items

I have documents that look like:

{
    "tags" => [
        "tag1",
        "tag2",
    ],
    "name" => "Example 1"
}

{
    "tags" => [
        "tag1",
        "tag3",
        "tag4"
    ],
    "name" => "Example 2"
}

What I want now is to run a terms search where the given array might look like:

[tag1, tag3]

and the expected hit should be:

{
    "tags" => [
        "tag1",
        "tag3",
        "tag4"
    ],
    "name" => "Example 2"
}

However, when I do a query like:

GET _search
{
    "query": {
        "filtered": {
           "query": {
               "match_all": {}
           },
           "filter": {
               "bool": {
                   "must": [
                      {
                          "terms": {
                             "tags": [
                                "tag1",
                                "tag3"
                             ]
                          }
                      }
                   ]
               }
           }
       }
    }
}

I get both "Example 1" and "Example 2" as hits, since each of them contains either tag1 or tag3. From the documentation for terms, I figured out that terms is actually a contains query: a document matches as soon as it contains any one of the given terms.
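To make the difference concrete, here is a small Python sketch that models the two behaviours with plain sets. It does not talk to Elasticsearch; `match_any` and `match_all` are hypothetical helper names used only for illustration.

```python
# Local model of the two example documents from the question.
docs = [
    {"name": "Example 1", "tags": ["tag1", "tag2"]},
    {"name": "Example 2", "tags": ["tag1", "tag3", "tag4"]},
]
wanted = {"tag1", "tag3"}


def match_any(doc, terms):
    """Roughly what the terms filter does: at least one term is present."""
    return bool(set(doc["tags"]) & terms)


def match_all(doc, terms):
    """What the question asks for: every given term is present."""
    return terms <= set(doc["tags"])


any_hits = [d["name"] for d in docs if match_any(d, wanted)]
all_hits = [d["name"] for d in docs if match_all(d, wanted)]

print(any_hits)  # ['Example 1', 'Example 2']
print(all_hits)  # ['Example 2']
```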

In this case, how can I make sure that Example 2 is the only hit when querying with tag1 and tag3?

Ekenstein asked Aug 10 '15 09:08



1 Answer

For those looking at this in 2020 or later: you might have noticed that the minimum_should_match option of the terms query was deprecated a long time ago.

There is an alternative currently available, which is to use terms_set.

For example:

{
  "query": {
    "terms_set": {
      "programming_languages": {
        "terms": [ "c++", "java", "php" ],
        "minimum_should_match_field": "required_matches"
      }
    }
  }
}

The above example assumes that each document has a field required_matches containing an integer that defines how many of the given terms must match for that document.
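As a rough illustration of that per-document threshold, the following Python sketch models the comparison locally. It is a simplified model and not the Elasticsearch implementation; the two documents and the helper name `terms_set_field_match` are made up for the example.

```python
def terms_set_field_match(doc, field, terms, min_field):
    """Local model only: count how many query terms the document contains
    and compare that count with the threshold stored in the document."""
    matched = len(set(doc[field]) & set(terms))
    return matched >= doc[min_field]


# Hypothetical documents; required_matches is stored per document.
jane = {"programming_languages": ["c++", "java"], "required_matches": 2}
jason = {"programming_languages": ["java", "php"], "required_matches": 3}

terms = ["c++", "java", "php"]

jane_hit = terms_set_field_match(jane, "programming_languages", terms, "required_matches")
jason_hit = terms_set_field_match(jason, "programming_languages", terms, "required_matches")

print(jane_hit)   # True: 2 matching terms, threshold 2
print(jason_hit)  # False: 2 matching terms, threshold 3
```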

Often more useful is the alternative parameter minimum_should_match_script, which computes the required number of matches with a script instead of reading it from a field.

See the example below:

{
  "query": {
    "terms_set": {
      "programming_languages": {
        "terms": [ "c++", "java", "php" ],
        "minimum_should_match_script": {
          "source": "2"
        }
      }
    }
  }
}
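Applied to the original question, the script can return params.num_terms (the number of terms supplied in the query), so that every given tag has to be present. Below is a minimal Python sketch that only builds the request body. It assumes the tags field is mapped as a keyword-like field, and `all_terms_query` is a hypothetical helper name rather than part of any client library.

```python
import json


def all_terms_query(field, terms):
    """Build a terms_set request body that requires every given term to match."""
    return {
        "query": {
            "terms_set": {
                field: {
                    "terms": list(terms),
                    # params.num_terms is the number of terms supplied above,
                    # so the threshold always equals "all of them".
                    "minimum_should_match_script": {"source": "params.num_terms"},
                }
            }
        }
    }


body = all_terms_query("tags", ["tag1", "tag3"])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a GET _search request. With the two example documents from the question, only "Example 2" contains both tags.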

You can always use this query inside a filter context to make it work as a filter.
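One way to do that is to place the clause under the filter section of a bool query, where it is not scored. The sketch below only builds the request body as a Python dict; `in_filter_context` is a hypothetical helper name.

```python
def in_filter_context(clause):
    """Wrap a query clause in bool.filter so it only filters and is not scored."""
    return {"query": {"bool": {"filter": [clause]}}}


terms_set_clause = {
    "terms_set": {
        "programming_languages": {
            "terms": ["c++", "java", "php"],
            "minimum_should_match_script": {"source": "2"},
        }
    }
}

filtered_body = in_filter_context(terms_set_clause)
```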

Read more here

Abdul Vajid answered Oct 09 '22 04:10