
Elasticsearch match an array of strings

My Elasticsearch (v5.4.1) documents have a _patents field like this:

{
    // (Other fields : title, text, date, etc.)
    ,
    "_patents": [
        {"cc": "US"},
        {"cc": "MX"},
        {"cc": "KR"},
        {"cc": "JP"},
        {"cc": "CN"},
        {"cc": "CA"},
        {"cc": "AU"},
        {"cc": "AR"}
    ]
}

I'm trying to build a query that returns only documents whose patents match every country code in an array. For instance, if my filter is ["US","AU"], I need all documents that have patents in both US and AU. Documents that have US but not AU must be excluded.

So far I have tried adding a "terms" clause to my current working query:

{
    "query": {
        "bool": {
            "must": [
                // (Other conditions here : title match, text match, date range, etc.) These work
                 ,
                {
                    "terms": {
                        "_patents.cc": [ // I tried just "_patents"
                            "US",
                            "AU"
                        ]
                    }
                }
            ]
        }
    }
}

Or this, as a filter:

{
    "query": {
        "bool": {
            "must": [...],
            "filter": {
                "terms": {
                    "_patents": [
                        "US",
                        "AU"
                    ]
                }
            }
        }
    }
}

These queries and the variants I've tried don't produce an error, but they return 0 results.

I can't change my ES document model to something easier to match, like "_patents": [ "US","CA", "AU", "CN", "JP" ], because this is a populated field: at indexing time, I populate and reference Patent documents that have many fields, including cc.

Jeremy Thille asked Jun 16 '17 07:06


3 Answers

I found the solution: the country codes in the filter have to be lowercase.

"US" returns no results, but "us" works, even though the indexed value is "US".

I also rewrote the query this way:

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "_patents.cc": "us"
          }
        },
        {
          "term": {
            "_patents.cc": "ca"
          }
        }
      ]
    }
  }
}  
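
The lowercase requirement is what one would expect if `_patents.cc` is an analyzed text field: the standard analyzer lowercases tokens at index time, while `term` looks its value up exactly as given. That is an assumption about the mapping, which the question does not show. Note also that a single `terms` clause matches documents containing any one of the listed values, so one `term` clause per code is what enforces "all of them". A minimal sketch of building that query from a list of codes (the helper name `all_codes_query` and its parameters are hypothetical, and nothing here talks to a cluster):

```python
import json


def all_codes_query(codes, field="_patents.cc", lowercase=True):
    """Build a bool query that requires every code in `codes` to be present.

    One `term` clause per code goes into `must`, so a document has to
    match all of them.
    """
    clauses = []
    for code in codes:
        # Assumption: the field is analyzed, so indexed tokens are lowercase
        # and the unanalyzed `term` value has to be lowercase as well.
        value = code.lower() if lowercase else code
        clauses.append({"term": {field: value}})
    return {"query": {"bool": {"must": clauses}}}


if __name__ == "__main__":
    # Prints the request body for the ["US", "AU"] filter from the question.
    print(json.dumps(all_codes_query(["US", "AU"]), indent=2))
```

If the index was created with the default dynamic mapping, a `_patents.cc.keyword` sub-field may also exist; in that case `all_codes_query(codes, field="_patents.cc.keyword", lowercase=False)` would target the original uppercase values. Whether that sub-field exists depends on the mapping, so check it before relying on it.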
Jeremy Thille answered Nov 17 '22 16:11


This works with both uppercase and lowercase values, since a match query analyzes the search text before looking it up:

{
  "query": {
    "bool": {
      "must": [ 
        {
          "match": {
            "_patents.cc": "au"
          }
        },
        {
          "match": {
            "_patents.cc": "us"
          }
        }
      ]
    }
  }
}
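
To see why `match` is case-insensitive here while `term` is not, the following toy model may help. It is plain Python, not Elasticsearch: `analyze` is a simplified stand-in for an analyzer (split on non-alphanumerics, lowercase), and all function names and the two sample documents are made up for illustration.

```python
import re


def analyze(text):
    """Toy stand-in for an analyzer: split on non-alphanumerics, lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]


def build_index(docs):
    """Map each analyzed token to the ids of documents whose cc values contain it."""
    index = {}
    for doc_id, patents in docs.items():
        for patent in patents:
            for token in analyze(patent["cc"]):
                index.setdefault(token, set()).add(doc_id)
    return index


def term_lookup(index, value):
    """Like `term`: the value is looked up exactly as given."""
    return index.get(value, set())


def match_lookup(index, text):
    """Like `match`: the text is analyzed first, then each token is looked up."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return hits


# Hypothetical sample documents, shaped like the _patents field in the question
docs = {
    1: [{"cc": "US"}, {"cc": "AU"}],
    2: [{"cc": "US"}, {"cc": "CA"}],
}
index = build_index(docs)

print(term_lookup(index, "US"))  # empty: only the token "us" was indexed
print(term_lookup(index, "us"))  # both documents
# Two lookups combined with AND, like two clauses in `must`: only document 1
print(match_lookup(index, "US") & match_lookup(index, "AU"))
```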
Rishi Pandey answered Nov 17 '22 16:11


I'm on Elasticsearch 6.0.1 and use this approach:

GET <your index>/_search
{
  "query": {
    "bool": {
      "must": [{
        "query_string": {
          "query": "_patents.cc:us AND _patents.cc:ca"
        }
      }]
    }    
  }
}
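
Because the question asks for documents that contain every listed code, the clauses need AND between them; OR would also return documents that have only one of the codes. A minimal sketch of assembling such a query string from a list, assuming the field path `_patents.cc` from the question's documents (the helper name `query_string_all` is hypothetical):

```python
def query_string_all(codes, field="_patents.cc"):
    """Build a query_string query with one `field:code` clause per code, joined by AND."""
    clauses = []
    for code in codes:
        # Keep the sketch simple: refuse anything that would need escaping
        # in query_string syntax (spaces, colons, quotes, and so on).
        if not (code.isascii() and code.isalpha()):
            raise ValueError(f"unexpected country code: {code!r}")
        # Lowercased to mirror the lowercase values used in the query above
        clauses.append(f"{field}:{code.lower()}")
    return {"query": {"query_string": {"query": " AND ".join(clauses)}}}


if __name__ == "__main__":
    print(query_string_all(["US", "CA"]))
```

The returned dictionary is only the request body; sending it to `<your index>/_search` is left to whatever client is already in use.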
1nstinct answered Nov 17 '22 16:11