I've run into a problem when accessing inner_hits data with the Python Elasticsearch client. I get a RequestError(400, 'search_phase_execution_exception', 'failed to create query ...') error when I try to use "inner_hits": {}.
My Elasticsearch version is 6.5.4 and my Python version is 3.7.2.
from elasticsearch import Elasticsearch
es = Elasticsearch()
mapping = '''{
"mappings": {
"tablets": {
"properties": {
"Names": {
"type": "nested"
"properties":{
"ID": {"type" : "long"},
"Combination": {"type" : "text"},
"Synonyms": {"type" : "text"}
}
}
}
}
}
}'''
es.indices.create(index="2", ignore=400, body=mapping)
tablets = {
"Names":[
{
"ID" : 1,
"Combination": "Paracetamol",
"Synonyms": "Crocin"
},{
"ID" : 2,
"Combination": "Pantaprazole",
"Synonyms": "Pantap"
}]}
res = es.index(index="2", doc_type='json', id=1, body=tablets)
z = "patient took Pantaprazole."
res= es.search(index='2',body=
{
"query": {
"nested": {
"path": "Names",
"query": {
"match": {"Names.Combination" : z}
},
"inner_hits": {}
}
}
})
print(res)
Output:
"inner_hits": {}
File "C:\Users\aravind\AppData\Local\Programs\Python\Python37-32\lib\site-packages\elasticsearch\client\utils.py", line 76, in _wrapped
return func(*args, params=params, **kwargs)
File "C:\Users\aravind\AppData\Local\Programs\Python\Python37-32\lib\site-packages\elasticsearch\client\__init__.py", line 660, in search
doc_type, '_search'), params=params, body=body)
File "C:\Users\aravind\AppData\Local\Programs\Python\Python37-32\lib\site-packages\elasticsearch\transport.py", line 318, in perform_request
status, headers_response, data = connection.perform_request(method, url, params, body, headers=headers, ignore=ignore, timeout=timeout)
File "C:\Users\aravind\AppData\Local\Programs\Python\Python37-32\lib\site-packages\elasticsearch\connection\http_urllib3.py", line 186, in perform_request
self._raise_error(response.status, raw_data)
File "C:\Users\aravind\AppData\Local\Programs\Python\Python37-32\lib\site-packages\elasticsearch\connection\base.py", line 125, in _raise_error
raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'failed to create query: {\n "nested" : {\n "query" : {\n
"match" : {\n "Names.Combination" : {\n "query" : "patient took Pantaprazole.",\n "operator" : "OR",\n "prefix_length" : 0,\n "max_expansions" : 50,\n "fuzzy_transpositions" : true,\n "lenient" : false,\n "zero_terms_query" : "NONE",\n "auto_generate_synonyms_phrase_query" : true,\n "boost" : 1.0\n }\n }\n },\n "path" : "Names",\n "ignore_unmapped" : false,\n "score_mode" : "avg",\n "boost" : 1.0,\n "inner_hits" : {\n "ignore_unmapped" : false,\n "from" : 0,\n "size" : 3,\n "version" : false,\n "explain" : false,\n "track_scores" : false\n }\n }\n}')
Thanks for posting your code exactly as you run it, in a form that can be copy-pasted and run. It really helps a lot.
There was a comma missing in the JSON of your mapping, but the resulting error was ignored because you set ignore=400.
Here's what the fixed script should look like:
import time
from elasticsearch import Elasticsearch
es = Elasticsearch()
# fix typo - missing comma after "nested"
mapping = '''{
"mappings": {
"tablets": {
"properties": {
"Names": {
"type": "nested",
"properties":{
"ID": {"type" : "long"},
"Combination": {"type" : "text"},
"Synonyms": {"type" : "text"}
}
}
}
}
}
}'''
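# remove ignore=400 so that a failed index creation raises an exception
# (if index "2" is left over from an earlier run, delete it first with es.indices.delete(index="2"))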
es.indices.create(index="2", body=mapping)
tablets = {
"Names": [
{
"ID": 1,
"Combination": "Paracetamol",
"Synonyms": "Crocin"
}, {
"ID": 2,
"Combination": "Pantaprazole",
"Synonyms": "Pantap"
}
]
}
We also need to set doc_type to the one declared in the mapping:
# set doc_type to 'tablets' since this is what we defined in mapping
res = es.index(index="2", doc_type='tablets', id=1, body=tablets)
z = "patient took Pantaprazole."
# allow Elasticsearch to refresh data so it is searchable
time.sleep(2)
res= es.search(index='2',body=
{
"query": {
"nested": {
"path": "Names",
"query": {
"match": {"Names.Combination" : z}
},
"inner_hits": {}
}
}
})
print(res)
That's it! The output of the script will look like:
{'took': 7, 'timed_out': False, '_shards': {'total': 5, 'successful': 5, 'skipped': 0, 'failed': 0}, 'hits': {'total': 1, 'max_score': 0.6931472, 'hits': [{'_index': '2', '_type': 'tablets', '_id': '1', '_score': 0.6931472, '_source': {'Names': [{'ID': 1, 'Combination': 'Paracetamol', 'Synonyms': 'Crocin'}, {'ID': 2, 'Combination': 'Pantaprazole', 'Synonyms': 'Pantap'}]}, 'inner_hits': {'Names': {'hits': {'total': 1, 'max_score': 0.6931472, 'hits': [{'_index': '2', '_type': 'tablets', '_id': '1', '_nested': {'field': 'Names', 'offset': 1}, '_score': 0.6931472, '_source': {'ID': 2, 'Combination': 'Pantaprazole', 'Synonyms': 'Pantap'}}]}}}}]}}
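If you only need the nested objects that matched, rather than the whole document, they can be read from the inner_hits part of each hit. The following is a minimal sketch that needs no running cluster: the response dict is trimmed by hand from the output above, and matched_nested is a hypothetical helper name, not part of the Elasticsearch client.

```python
# Response trimmed by hand from the output shown above (only the keys used here).
response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "inner_hits": {
                    "Names": {
                        "hits": {
                            "hits": [
                                {
                                    "_nested": {"field": "Names", "offset": 1},
                                    "_score": 0.6931472,
                                    "_source": {
                                        "ID": 2,
                                        "Combination": "Pantaprazole",
                                        "Synonyms": "Pantap",
                                    },
                                }
                            ]
                        }
                    }
                },
            }
        ]
    }
}


def matched_nested(response, path):
    """Collect the _source of every nested object under `path` that matched."""
    matches = []
    for hit in response["hits"]["hits"]:
        inner = hit.get("inner_hits", {}).get(path, {})
        for inner_hit in inner.get("hits", {}).get("hits", []):
            matches.append(inner_hit["_source"])
    return matches


print(matched_nested(response, "Names"))
# prints [{'ID': 2, 'Combination': 'Pantaprazole', 'Synonyms': 'Pantap'}]
```

In the real script you would pass res from es.search() instead of the hand-written dict.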
Why "failed to create query"?
Elasticsearch raised the error "failed to create query" because it could not build a nested query against a field that was not mapped as nested.
The field was supposed to be nested, so why wasn't it?
There was a typo in the mapping: a comma was missing. Elasticsearch rejected the malformed mapping, so the index was never created with it. Why didn't the script fail?
Because the call to es.indices.create() passed ignore=400, which makes the Python Elasticsearch client ignore HTTP 400 (Bad Request) responses, and that is the status Elasticsearch returns for a malformed request body.
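One way to catch this kind of typo before anything is sent to Elasticsearch is to parse the mapping string locally with the standard json module. This is only a sketch: the mapping below is a trimmed copy of the one in the question, and check_json is a hypothetical helper name.

```python
import json

# Trimmed copy of the mapping from the question; the comma after "nested" is missing.
broken_mapping = '''{
  "mappings": {
    "tablets": {
      "properties": {
        "Names": {
          "type": "nested"
          "properties": {
            "ID": {"type": "long"}
          }
        }
      }
    }
  }
}'''


def check_json(text):
    """Return None if text is valid JSON, otherwise a short description of the error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


problem = check_json(broken_mapping)
print(problem)  # reports where the parser expected a ',' delimiter

# Adding the missing comma makes the mapping parse cleanly.
fixed_mapping = broken_mapping.replace('"nested"', '"nested",')
print(check_json(fixed_mapping))  # prints None
```

Another option is to write the mapping as a Python dict instead of a string, so that a missing comma becomes a Python syntax error as soon as the script is loaded.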
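So why did Elasticsearch still let you index documents and run searches?
Because by default Elasticsearch does not require an explicit mapping and infers one from the structure of the documents. This is called dynamic mapping. With dynamic mapping, an array of objects such as Names is mapped as a plain object field rather than as nested, which is why the nested query could not be built.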
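To see which mapping an index actually ended up with, you can inspect the result of es.indices.get_mapping(index="2"). The sketch below shows the check itself without needing a cluster. The two dicts are hand-written approximations of what a 6.x cluster would return for the dynamically mapped and the explicitly mapped index, not captured output, and field_type is a hypothetical helper name.

```python
# Approximation (an assumption, not captured output) of the mapping that dynamic
# mapping produces for the document indexed with doc_type='json'.
# Object fields are reported without an explicit "type" key.
dynamic_mapping = {
    "2": {
        "mappings": {
            "json": {
                "properties": {
                    "Names": {
                        "properties": {
                            "ID": {"type": "long"},
                            "Combination": {"type": "text"},
                            "Synonyms": {"type": "text"},
                        }
                    }
                }
            }
        }
    }
}

# Approximation of the mapping after the corrected script has created the index.
explicit_mapping = {
    "2": {
        "mappings": {
            "tablets": {
                "properties": {
                    "Names": {
                        "type": "nested",
                        "properties": {
                            "ID": {"type": "long"},
                            "Combination": {"type": "text"},
                            "Synonyms": {"type": "text"},
                        },
                    }
                }
            }
        }
    }
}


def field_type(mapping_response, index, doc_type, field):
    """Return the mapped type of a top-level field, defaulting to 'object'."""
    props = mapping_response[index]["mappings"][doc_type]["properties"]
    return props[field].get("type", "object")


print(field_type(dynamic_mapping, "2", "json", "Names"))      # prints object
print(field_type(explicit_mapping, "2", "tablets", "Names"))  # prints nested
```

If the check reports object for a field you expect to query with a nested query, the explicit mapping was not applied.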