
Elastic Search - Failed To Create Query

I've run into a problem accessing inner_hits data with the Python Elasticsearch client. I get this error:

RequestError(400,'search_phase_execution_exception', 'failed to create query'

whenever I try to use "inner_hits": {}.
My Elasticsearch version is 6.5.4 and my Python version is 3.7.2.

from elasticsearch import Elasticsearch
es = Elasticsearch()


mapping = '''{
        "mappings": {
    "tablets": {
      "properties": {
        "Names": {
          "type": "nested"
          "properties":{
              "ID": {"type" : "long"},
              "Combination": {"type" : "text"},
              "Synonyms": {"type" : "text"}
          }
        }
      }
    }
  }
}'''

es.indices.create(index="2", ignore=400, body=mapping)

tablets = {
    "Names":[
    {
    "ID" : 1,    
    "Combination": "Paracetamol",
    "Synonyms": "Crocin"
    },{
    "ID" : 2,
    "Combination": "Pantaprazole",
    "Synonyms": "Pantap"
    }]}

res = es.index(index="2", doc_type='json', id=1, body=tablets)

z = "patient took Pantaprazole."



res= es.search(index='2',body=
{
  "query": {
    "nested": {
      "path": "Names",
      "query": {
        "match": {"Names.Combination" : z}
      },
      "inner_hits": {} 
    }
  }
})
print(res)

Output:

    "inner_hits": {}
      File "C:\Users\aravind\AppData\Local\Programs\Python\Python37-32\lib\site-packages\elasticsearch\client\utils.py", line 76, in _wrapped
        return func(*args, params=params, **kwargs)
      File "C:\Users\aravind\AppData\Local\Programs\Python\Python37-32\lib\site-packages\elasticsearch\client\__init__.py", line 660, in search
        doc_type, '_search'), params=params, body=body)
      File "C:\Users\aravind\AppData\Local\Programs\Python\Python37-32\lib\site-packages\elasticsearch\transport.py", line 318, in perform_request
        status, headers_response, data = connection.perform_request(method, url, params, body, headers=headers, ignore=ignore, timeout=timeout)
      File "C:\Users\aravind\AppData\Local\Programs\Python\Python37-32\lib\site-packages\elasticsearch\connection\http_urllib3.py", line 186, in perform_request
        self._raise_error(response.status, raw_data)
      File "C:\Users\aravind\AppData\Local\Programs\Python\Python37-32\lib\site-packages\elasticsearch\connection\base.py", line 125, in _raise_error
        raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
    elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'failed to create query: {\n  "nested" : {\n    "query" : {\n
     "match" : {\n        "Names.Combination" : {\n          "query" : "patient took Pantaprazole.",\n          "operator" : "OR",\n          "prefix_length" : 0,\n          "max_expansions" : 50,\n          "fuzzy_transpositions" : true,\n          "lenient" : false,\n          "zero_terms_query" : "NONE",\n          "auto_generate_synonyms_phrase_query" : true,\n          "boost" : 1.0\n        }\n      }\n    },\n    "path" : "Names",\n    "ignore_unmapped" : false,\n    "score_mode" : "avg",\n    "boost" : 1.0,\n    "inner_hits" : {\n      "ignore_unmapped" : false,\n      "from" : 0,\n      "size" : 3,\n      "version" : false,\n      "explain" : false,\n      "track_scores" : false\n    }\n  }\n}')
surya aravind asked Jan 26 '23 18:01



1 Answer

Thanks for posting your code exactly as you run it, in a form that can be copy-pasted and run. It really helps a lot.

A comma was missing in the JSON of your mapping, but the resulting error was ignored because you set ignore=400.
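A typo like this can be caught before the request is ever sent by parsing the mapping string locally. The following is only a minimal sketch using Python's standard json module; check_json is a hypothetical helper name, and the mapping is a shortened copy of the one from the question with the comma still missing.

```python
import json

# Shortened copy of the mapping from the question; the comma after
# "nested" is still missing on purpose.
broken_mapping = '''{
  "mappings": {
    "tablets": {
      "properties": {
        "Names": {
          "type": "nested"
          "properties": {
            "ID": {"type": "long"}
          }
        }
      }
    }
  }
}'''


def check_json(text):
    """Return None if text is valid JSON, otherwise a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


# Prints the position of the problem instead of sending a bad request
print(check_json(broken_mapping))
```

Building the mapping as a Python dict instead of a string would have the same effect, since the interpreter would reject the missing comma as a syntax error.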

Here's what the fixed script should look like:

import time

from elasticsearch import Elasticsearch
es = Elasticsearch()

# fix typo - missing comma after "nested"
mapping = '''{
"mappings": {
    "tablets": {
      "properties": {
        "Names": {
          "type": "nested",
          "properties":{
              "ID": {"type" : "long"},
              "Combination": {"type" : "text"},
              "Synonyms": {"type" : "text"}
          }
        }
      }
    }
  }
}'''

# remove ignore=400 so that a rejected mapping raises an exception
es.indices.create(index="2", body=mapping)

tablets = {
    "Names": [
        {
            "ID": 1,
            "Combination": "Paracetamol",
            "Synonyms": "Crocin"
        }, {
            "ID": 2,
            "Combination": "Pantaprazole",
            "Synonyms": "Pantap"
        }
    ]
}

We also need to set doc_type to the one declared in the mapping:

# set doc_type to 'tablets' since this is what we defined in mapping
res = es.index(index="2", doc_type='tablets', id=1, body=tablets)

z = "patient took Pantaprazole."

# allow Elasticsearch to refresh data so it is searchable
time.sleep(2)

res= es.search(index='2',body=
{
  "query": {
    "nested": {
      "path": "Names",
      "query": {
        "match": {"Names.Combination" : z}
      },
      "inner_hits": {}
    }
  }
})
print(res)

That's it! The output of the script will look like:

{'took': 7, 'timed_out': False, '_shards': {'total': 5, 'successful': 5, 'skipped': 0, 'failed': 0}, 'hits': {'total': 1, 'max_score': 0.6931472, 'hits': [{'_index': '2', '_type': 'tablets', '_id': '1', '_score': 0.6931472, '_source': {'Names': [{'ID': 1, 'Combination': 'Paracetamol', 'Synonyms': 'Crocin'}, {'ID': 2, 'Combination': 'Pantaprazole', 'Synonyms': 'Pantap'}]}, 'inner_hits': {'Names': {'hits': {'total': 1, 'max_score': 0.6931472, 'hits': [{'_index': '2', '_type': 'tablets', '_id': '1', '_nested': {'field': 'Names', 'offset': 1}, '_score': 0.6931472, '_source': {'ID': 2, 'Combination': 'Pantaprazole', 'Synonyms': 'Pantap'}}]}}}}]}}
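The time.sleep(2) in the script above works, but it only guesses how long the refresh takes. One possible alternative is to ask Elasticsearch to refresh the index explicitly before searching. This is a sketch, not part of the original script: index_and_refresh is a hypothetical helper, and es is assumed to be an Elasticsearch client like the one created above.

```python
def index_and_refresh(es, index, doc_type, doc_id, body):
    """Index one document, then refresh the index so the next search sees it."""
    # Same call as in the script above
    result = es.index(index=index, doc_type=doc_type, id=doc_id, body=body)
    # Make the newly indexed document visible to search immediately
    es.indices.refresh(index=index)
    return result
```

With the names from the script it would be called as index_and_refresh(es, "2", "tablets", 1, tablets). An explicit refresh on every write is fine for a small test script but is usually avoided under heavy indexing load.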

Why did I get the "failed to create query" error?

Elasticsearch raised the "failed to create query" error because it could not build a nested query against a field that is not mapped as nested.

The field was supposed to be nested, so why wasn't it?

There was a typo in the mapping: a comma was missing, so Elasticsearch rejected the request and the mapping was never applied.

Why didn't the script fail?

Because the Python call to es.indices.create() was given the ignore=400 parameter, which makes the Python Elasticsearch client ignore an HTTP 400 response instead of raising an exception. HTTP 400 (Bad Request) is what Elasticsearch returns for a malformed request body.
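If ignore=400 is kept so that re-running the script does not fail on an index that already exists, the return value can still be inspected, since the client then hands back the error body instead of raising. A rough sketch under that assumption: check_create_response is a hypothetical helper, and the sample response bodies are hand-written for illustration, not captured from a real cluster.

```python
def check_create_response(response, allowed=("resource_already_exists_exception",)):
    """Return the response unless it carries an error type that is not tolerated."""
    error = response.get("error") if isinstance(response, dict) else None
    if error is None:
        return response
    error_type = error.get("type") if isinstance(error, dict) else None
    if error_type in allowed:
        # The index is already there; safe to continue
        return response
    raise RuntimeError(f"index creation failed: {error}")


# Hand-written sample bodies (assumed shapes, for illustration only)
created = {"acknowledged": True, "index": "2"}
exists = {"error": {"type": "resource_already_exists_exception"}, "status": 400}
malformed = {"error": {"type": "example_parse_error"}, "status": 400}

check_create_response(created)   # passes through
check_create_response(exists)    # tolerated
try:
    check_create_response(malformed)
except RuntimeError as exc:
    print(exc)
```

In the script this would wrap the call, for example check_create_response(es.indices.create(index="2", ignore=400, body=mapping)).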

So why did Elasticsearch still let you index documents and run searches?

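Because by default Elasticsearch does not require a mapping and infers one from the structure of the documents. This is called dynamic mapping. Under dynamic mapping an array of objects such as Names becomes a plain object field, not a nested one, which is why the nested query could not be created.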
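To see which of the two mappings an index actually ended up with, the "properties" section of the mapping can be inspected. The sketch below works on plain dicts so it runs without a cluster; field_type is a hypothetical helper, and both sample mappings are hand-written approximations of the explicit mapping and of what dynamic mapping would typically infer.

```python
def field_type(properties, field):
    """Return the declared type of a top-level field.

    Object fields are returned without an explicit "type" key,
    so a missing key is reported as "object".
    """
    return properties[field].get("type", "object")


# The mapping the script intended to create
intended = {
    "Names": {
        "type": "nested",
        "properties": {
            "ID": {"type": "long"},
            "Combination": {"type": "text"},
            "Synonyms": {"type": "text"},
        },
    }
}

# Roughly what dynamic mapping infers from the indexed document
dynamic = {
    "Names": {
        "properties": {
            "ID": {"type": "long"},
            "Combination": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            },
        }
    }
}

print(field_type(intended, "Names"))
print(field_type(dynamic, "Names"))
```

On a live 6.x cluster the real properties would come from es.indices.get_mapping(index="2"), under the index name, then "mappings", then the document type.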

Nikolay Vasiliev answered Feb 04 '23 01:02
