
Term, nested documents and must_not query incompatible in ElasticSearch?

I am having trouble combining term and must_not queries on nested documents.

A Sense example can be found here: http://sense.qbox.io/gist/be436a1ffa01e4630a964f48b2d5b3a1ef5fa176

Here is my mapping:

{
    "mappings": {
        "docs" : {
            "properties": {
                "tags" : {
                    "type": "nested",
                    "properties" : {
                        "type": {
                           "type": "string",
                           "index": "not_analyzed"
                        }
                    }
                },
                "label" : {
                    "type": "string"
                }
            }
        }
    }
}

There are two documents in this index:

{
    "tags" : [
        {"type" : "POST"},
        {"type" : "DELETE"}
    ],
    "label" : "item 1"
},
{
    "tags" : [
        {"type" : "POST"}
    ],
    "label" : "item 2"
}

When I query this index like this:

{
  "query": {
    "nested": {
      "path": "tags",
      "query": {
        "bool": {
          "must": {
            "term": {
              "tags.type": "DELETE"
            }
          }
        }
      }
    }
  }
}

I get one hit (which is correct).

When I want to get the documents WHICH DON'T CONTAIN the tag "DELETE", I use this query:

{
  "query": {
    "nested": {
      "path": "tags",
      "query": {
        "bool": {
          "must_not": {
            "term": {
              "tags.type": "delete"
            }
          }
        }
      }
    }
  }
}

I get 2 hits (which is incorrect). This issue seems very close to this one (Elasticsearch array must and must_not), but it is not the same.

Can you give me some clues to resolve this issue?

Thank you.

user3393203 asked Mar 07 '14 15:03


2 Answers

Your original query is evaluated against each nested object individually, because nested objects are indexed as separate hidden documents. The must_not clause only eliminates the nested objects that match the term. If any other nested object is left (for example the one with type "POST"), it satisfies the query, so its parent document is returned. That is why you get both documents back.

Original code:

{
  "query": {
    "nested": {
      "path": "tags",
      "query": {
        "bool": {
          "must_not": {
            "term": {
              "tags.type": "delete"
            }
          }
        }
      }
    }
  }
}

The solution is quite simple: move the bool query outside the nested query. Now every document that has a nested object with the "DELETE" type is discarded, which is just what you wanted.

The solution:

{
  "query": {
    "bool": {
      "must_not": {
        "nested": {
          "path": "tags",
          "query": {
            "term": {
              "tags.type": "DELETE"
            }
          }
        }
      }
    }
  }
}
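
To make the difference concrete, here is a toy model in Python. It is not Elasticsearch code; it only assumes that a nested query matches a root document when at least one of its nested objects satisfies the inner query. The helper names (nested_match, is_delete) are made up for this illustration.

```python
# Toy model of nested-query semantics; this is NOT Elasticsearch code.
# Assumption: a nested query matches the root document if ANY of its
# nested objects satisfies the inner query.
docs = [
    {"label": "item 1", "tags": [{"type": "POST"}, {"type": "DELETE"}]},
    {"label": "item 2", "tags": [{"type": "POST"}]},
]


def nested_match(doc, predicate):
    """Return True if any nested tag object satisfies the predicate."""
    return any(predicate(tag) for tag in doc["tags"])


def is_delete(tag):
    """Exact match on the tag type, like a term query on a not_analyzed field."""
    return tag["type"] == "DELETE"


# must_not INSIDE the nested query: each tag is tested on its own, so a
# document is returned as soon as one of its tags is not "DELETE".
inside = [d["label"] for d in docs if nested_match(d, lambda t: not is_delete(t))]

# must_not AROUND the nested query: a document is dropped as soon as
# one of its tags is "DELETE".
outside = [d["label"] for d in docs if not nested_match(d, is_delete)]

print(inside)   # ['item 1', 'item 2']
print(outside)  # ['item 2']
```

"item 1" survives the first form because its "POST" tag is not "DELETE"; only the second form looks at the document as a whole.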

NOTE: Your strings are "not_analyzed", so a term query has to match the stored value exactly, and you searched for "delete" instead of "DELETE". If you want a case-insensitive search, make your strings analyzed.
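
A rough sketch of the case issue, again as a toy model in Python rather than real Elasticsearch behaviour. It assumes the analyzer does nothing but lowercase a single-word value (a real analyzer also tokenizes), and that a term query compares the query text as-is against the indexed token. The function names are hypothetical.

```python
# Toy model of term matching; this is NOT Elasticsearch code.
# Assumption: for a single-word value, analysis only lowercases it.
def index_token(value, analyzed):
    """Token stored in the index for a single-word string value."""
    return value.lower() if analyzed else value


def term_matches(query_term, stored_value, analyzed):
    """A term query is not analyzed: it compares the raw query term to the token."""
    return query_term == index_token(stored_value, analyzed)


print(term_matches("delete", "DELETE", analyzed=False))  # False
print(term_matches("DELETE", "DELETE", analyzed=False))  # True
print(term_matches("delete", "DELETE", analyzed=True))   # True
```

Under the same assumption, a term query for "DELETE" would no longer match once the field is analyzed, because the stored token is then lowercase.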

Roeland Van Heddegem answered Sep 19 '22 15:09


This should fix your problem: http://sense.qbox.io/gist/f4694f542bc76c29624b5b5c9b3ecdee36f7e3ea

The two most important things:

  1. Set include_in_root on "tags". This tells ES to also index the tag types on the root document as "tags.type": ["POST", "DELETE"], so you can access those values as a "flattened" array on the root doc. This means you no longer need a nested query (see #2).

  2. Drop the nested query.

 

{
    "mappings": {
        "docs" : {
            "properties": {
                "tags" : {
                    "type": "nested",
                    "properties" : {
                        "type": {
                           "type": "string",
                           "index": "not_analyzed"
                        }
                    },
                    "include_in_root": true
                },
                "label" : {
                    "type": "string"
                }
            }
        }
    }
}

 

{
   "query": {
      "bool": {
         "must_not": {
            "term": {
               "tags.type": "DELETE"
            }
         }
      }
   }
}
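
As an illustration of what the flattening buys you, here is a toy model in Python. It is not Elasticsearch code; it assumes that include_in_root copies every nested field value into a flat array on the root document under the same dotted name. The helper flatten_into_root is made up for this sketch.

```python
# Toy model of include_in_root; this is NOT Elasticsearch code.
# Assumption: nested field values are copied to the root document as
# flat arrays keyed by "<path>.<field>".
docs = [
    {"label": "item 1", "tags": [{"type": "POST"}, {"type": "DELETE"}]},
    {"label": "item 2", "tags": [{"type": "POST"}]},
]


def flatten_into_root(doc, path="tags"):
    """Collect the values of each nested field into one list per field."""
    flat = {}
    for obj in doc[path]:
        for key, value in obj.items():
            flat.setdefault(f"{path}.{key}", []).append(value)
    return flat


# A plain must_not term query on the root doc: keep the documents whose
# flattened "tags.type" array does not contain "DELETE".
hits = [
    d["label"]
    for d in docs
    if "DELETE" not in flatten_into_root(d).get("tags.type", [])
]

print(flatten_into_root(docs[0]))  # {'tags.type': ['POST', 'DELETE']}
print(hits)                        # ['item 2']
```

The trade-off is that the flattened array no longer records which values belonged to the same nested object, which does not matter here because each tag has a single field.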

Ben at Qbox.io answered Sep 18 '22 15:09