I'm attempting a bulk update triggered by a state change on a document property. Create works fine, but the bulk update is failing. I'm getting an error to the effect of "script or doc is missing", even though everything looks good.
Here is how I am attempting the bulk update:
frequency_cleared = [
    {
        "_id": result['_id'],
        "_type": "the-type",
        "_index": "the-index",
        "_source": result['_source'],
        "_op_type": 'update'
    }
    for result in search_results['hits']['hits']
]
The reason I'm iterating over my results is that I use an if in my list comprehension, but since I can see the results I get back, I know that isn't the issue. I can't show the results and had to change property names, since this is for the company I work at.
Here is the traceback:
elasticsearch.exceptions.RequestError:
    TransportError(400, 'action_request_validation_exception',
    'Validation Failed: 1: script or doc is missing...')
The ellipsis stands for the same error being repeated for every element in the list.
To update a single document in an index, call the Elasticsearch client's update() method. At the very minimum, a call to update() should include the index name, its document type (deprecated), the document ID, and the "body" content that is being updated.
For bulk loading from Python, the client's helpers module provides helpers.bulk(). The client instance ({CLIENT_OBJ}) is its first parameter, and a custom iterator ({ACTION_ITERATOR}) supplies the actions for bulk indexing several documents.
To confirm that Elasticsearch is running, you can use the requests library from Python. Documents are then bulked into an Elasticsearch index, and the index list can be brought up with a cURL request. A document's ID can be a custom universally unique identifier (UUID).
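A minimal sketch of bulk indexing with an action iterator, assuming the elasticsearch Python package is installed and a client object already exists. The function names, sample documents, and index name are made up for illustration; only helpers.bulk itself comes from the library.

```python
import uuid


def generate_actions(docs, index_name):
    """Yield one bulk 'index' action per document, each with a UUID as its _id."""
    for doc in docs:
        yield {
            "_op_type": "index",
            "_index": index_name,
            "_id": str(uuid.uuid4()),  # custom ID instead of an auto-generated one
            "_source": doc,
        }


def bulk_index(client, docs, index_name):
    """Send the actions with helpers.bulk (needs the elasticsearch package)."""
    # Imported lazily so the generator above can be tried without the package.
    from elasticsearch import helpers

    # With default settings this returns (number of successes, list of errors).
    return helpers.bulk(client, generate_actions(docs, index_name))


# Made-up documents, only to show the shape of the generated actions.
sample_docs = [{"name": "first"}, {"name": "second"}]
actions = list(generate_actions(sample_docs, "the-index"))
print(actions[0])
```

Using a generator keeps memory use flat when there are many documents, since helpers.bulk consumes the actions lazily.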
The Elasticsearch Update API is designed to update only one document at a time. However, if you want to update several documents, you can run a query to fetch them, put all of the document IDs into a Python list, and iterate over that list, making one update call per ID.
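The loop described above could look roughly like this. It assumes a 7.x-style client whose update() accepts the partial document as body={"doc": ...}; newer client versions may expect different keyword arguments. RecordingClient is a hypothetical stand-in so the sketch runs without a cluster, and the field name "frequency" is invented.

```python
class RecordingClient:
    """Hypothetical stand-in for an Elasticsearch client; it only records update() calls."""

    def __init__(self):
        self.calls = []

    def update(self, index, id, body):
        self.calls.append({"index": index, "id": id, "body": body})
        return {"_id": id, "result": "updated"}


def update_each(client, index_name, doc_ids, changes):
    """Call the single-document Update API once per ID."""
    responses = []
    for doc_id in doc_ids:
        # The partial document goes under the "doc" key of the request body.
        response = client.update(index=index_name, id=doc_id, body={"doc": changes})
        responses.append(response)
    return responses


# Made-up search response; a real one would come from client.search(...).
search_results = {"hits": {"hits": [{"_id": "1"}, {"_id": "2"}]}}
doc_ids = [hit["_id"] for hit in search_results["hits"]["hits"]]

client = RecordingClient()
responses = update_each(client, "the-index", doc_ids, {"frequency": None})
print(len(client.calls))
```

This makes one HTTP round trip per document, which is why a bulk request is usually preferable once the list of IDs grows.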
It was difficult to tell from the docs, but I found the issue. If you want to do a bulk update, you need to wrap your source in a dictionary under the key "doc". Here is the corrected example, hope this helps!
frequency_cleared = [
    {
        '_id': result['_id'],
        "_type": "the-type",
        "_index": "the-index",
        "_source": {'doc': result['_source']},
        '_op_type': 'update'
    }
    for result in search_results['hits']['hits']
]
Notice the slight change: "_source" is now {'doc': result['_source']}.
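For context, each update action in a bulk request is sent as an action line followed by a body line, and that body must contain either "doc" or "script", which is what the validation error is complaining about. As far as I can tell from the client's helper documentation, the "doc" key can also sit at the top level of the action instead of inside "_source". A hedged sketch of that variant follows; the function names, the keep filter, and the sample hits are made up, and "_type" is left out since it is deprecated.

```python
def build_update_actions(hits, index_name, keep=lambda hit: True):
    """Turn search hits into bulk 'update' actions with the partial document under 'doc'."""
    return [
        {
            "_op_type": "update",
            "_index": index_name,
            "_id": hit["_id"],
            "doc": hit["_source"],  # an update needs either 'doc' or 'script'
        }
        for hit in hits
        if keep(hit)  # optional filter, like the 'if' in the original comprehension
    ]


def send_updates(client, actions):
    """Send the actions with helpers.bulk (needs the elasticsearch package)."""
    from elasticsearch import helpers

    return helpers.bulk(client, actions)


# Made-up hits, only to show the resulting action structure.
hits = [
    {"_id": "a1", "_source": {"state": "cleared"}},
    {"_id": "b2", "_source": {"state": "active"}},
]
actions = build_update_actions(
    hits, "the-index", keep=lambda hit: hit["_source"]["state"] == "cleared"
)
print(actions)
```

Building the actions in a separate function makes it easy to inspect them before anything is sent to the cluster.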