
Elasticsearch, match any possible exact value of an Array in an Array

I want to query for all the documents that match any of the following values:

["Test","Cat","Dog"]

in the field categories.

I have the following mapping:

"categories": {
    "type": "string"
}

A couple of sample documents are

"categories": [
    "Test",
    "Cat"
]

Or

"categories": [
    "Blue Cat",
    "Ball"
]

I was able to pull it off with the following query:

query: {
    match: {
        categories: {
            query: ["Test","Cat","Dog"]
        }
    }
}

But that returns both documents, because they both include "Cat", even though one of them includes it only as part of "Blue Cat". How can I specify that I want the exact value "Cat", not a value that merely contains it?

I read about changing the field type on the mapping to nested, but an array is not accepted as a nested object since it doesn't have keys and values.

If I use this mapping:

"categories": {
    "type": "nested"
}

I get this error:

"object mapping for [categories] tried to parse field [null] as object, but found a concrete value"

How can I filter by the field categories using an array of possible values and making sure it matches at least one of the values exactly?

Danny Sandi asked Sep 19 '16 21:09



1 Answer

Change the field to be "not_analyzed". Right now it's using the default "standard" analyzer, which splits "Blue Cat" into two tokens and lowercases them, giving "blue" and "cat". Your match query text is analyzed the same way, so "Cat" becomes "cat", and that's why your query matches the document containing "Blue Cat".
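To make the tokenization point concrete, here is a rough Python sketch of the behaviour. It is only a model: `analyze` and `match_any` are hypothetical helpers, not Elasticsearch APIs, and `analyze` simply splits on non-word characters and lowercases, which is a simplification of what the real standard analyzer does.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: word tokens, lowercased."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]

def match_any(field_values, query_values):
    """Simplified match query on an analyzed field:
    a hit if any query token equals any indexed token."""
    indexed = {tok for value in field_values for tok in analyze(value)}
    wanted = {tok for value in query_values for tok in analyze(value)}
    return bool(indexed & wanted)

doc1 = ["Test", "Cat"]
doc2 = ["Blue Cat", "Ball"]
query = ["Test", "Cat", "Dog"]

print(analyze("Blue Cat"))     # ['blue', 'cat']
print(match_any(doc1, query))  # True
print(match_any(doc2, query))  # True, because "Blue Cat" was indexed as "blue" and "cat"
```

Under this model both documents match, which is the behaviour described in the question.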

Here is the mapping:

{
    "categories": {
        "type": "string",
        "index": "not_analyzed"
    }
}

I indexed two documents using the above mapping:

[{
    _index : "test_index",
    _type : "test",
    _id : "2",
    _score : 1,
    _source : {
        categories : [
            "Blue Cat",
            "Ball"
        ]
    }
}, {
    _index : "test_index",
    _type : "test",
    _id : "1",
    _score : 1,
    _source : {
        categories : [
            "Test",
            "Cat"
        ]
    }
}]

I searched using the query below:

{
"query" : {
    "constant_score" : {
        "filter" : {
            "terms" : { 
                "categories" :["Test","Cat","Dog"]
            }
        }
    }
}}

I get back only the second of those documents (_id 1):

{
"took" : 9,
"timed_out" : false,
"_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
},
"hits" : {
    "total" : 1,
    "max_score" : 1,
    "hits" : [{
            "_index" : "test_index",
            "_type" : "test",
            "_id" : "1",
            "_score" : 1,
            "_source" : {
                "categories" : [
                    "Test",
                    "Cat"
                ]
            }
        }
    ]
}}
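The result follows from how a terms filter treats a not_analyzed field: each array element is indexed as one whole, unmodified term, and the filter terms are not analyzed either. A minimal Python sketch of that comparison, assuming this simplified model (`terms_filter` is a hypothetical helper, not an Elasticsearch API):

```python
def terms_filter(field_values, terms):
    """Model of a terms filter on a not_analyzed field:
    values are compared whole and case-sensitively."""
    return any(value in terms for value in field_values)

docs = {
    "1": ["Test", "Cat"],
    "2": ["Blue Cat", "Ball"],
}
terms = {"Test", "Cat", "Dog"}

hits = [doc_id for doc_id, cats in docs.items() if terms_filter(cats, terms)]
print(hits)  # ['1']
```

One consequence of exact matching is that case now matters: in this model a stored value of "cat" would not match the term "Cat".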
jay answered Oct 16 '22 03:10
