Bulk update with Python's elasticsearch client

I'm attempting to do a bulk update based on a state change in a document property. Creating documents works fine, but the bulk update fails with an error to the effect of "script or doc is missing", even though everything looks correct to me.

Here is how I am attempting the bulk update:

frequency_cleared = [
    {
        "_id": result['_id'], 
        "_type": "the-type", 
        "_index": "the-index", 
        "_source": result['_source'],
        "_op_type": 'update'
    } 
    for result in search_results['hits']['hits']
]

I iterate over the results because my real list comprehension includes an if filter. I can see the results I get back, so I know that isn't the issue. I can't show the results and had to change the property names, since this is for the company I work at.

Here is the traceback:

elasticsearch.exceptions.RequestError:
TransportError(400, 'action_request_validation_exception',
  'Validation Failed: 1: script or doc is missing...') 

The ellipsis stands for the same error repeated for every element in the list.

Obj3ctiv3_C_88 asked Feb 03 '16 16:02


1 Answer

It was difficult to tell from the docs, but I found the issue. For a bulk update, the document has to be wrapped in a dictionary under the key "doc"; otherwise Elasticsearch sees neither a "doc" nor a "script" in the update request and rejects it. Here is the corrected example, hope this helps!

frequency_cleared = [
    {
        "_id": result["_id"],
        "_type": "the-type",
        "_index": "the-index",
        # An update action needs its fields under "doc" (or a "script").
        "_source": {"doc": result["_source"]},
        "_op_type": "update",
    }
    for result in search_results["hits"]["hits"]
]

Notice the one small change: the value of "_source" goes from result['_source'] to {'doc': result['_source']}.
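For anyone who wants to reuse this, here is a minimal sketch of the same idea as a small helper. The function name build_update_actions, the keep filter, and the "frequency" field in the sample hits are all hypothetical and only there for illustration; the "_type" key is left optional on the assumption that newer clusters no longer use document types. The call to helpers.bulk is shown only as a comment because it needs a live cluster.

```python
def build_update_actions(hits, index, doc_type=None, keep=lambda hit: True):
    """Turn search hits into partial-update actions for helpers.bulk.

    `keep` is a hypothetical filter hook standing in for the `if`
    used in the original list comprehension.
    """
    actions = []
    for hit in hits:
        if not keep(hit):
            continue
        action = {
            "_op_type": "update",
            "_index": index,
            "_id": hit["_id"],
            # An update needs "doc" (or "script"); a bare _source is rejected
            # with "script or doc is missing".
            "_source": {"doc": hit["_source"]},
        }
        if doc_type is not None:
            # Assumption: only needed on older clusters that still use types.
            action["_type"] = doc_type
        actions.append(action)
    return actions


# Made-up hits, shaped like search_results['hits']['hits'].
sample_hits = [
    {"_id": "1", "_source": {"frequency": 0}},
    {"_id": "2", "_source": {"frequency": 5}},
]

actions = build_update_actions(
    sample_hits,
    "the-index",
    doc_type="the-type",
    keep=lambda hit: hit["_source"]["frequency"] == 0,
)
print(actions)

# With a running cluster, the list would then be sent with something like:
#   from elasticsearch import Elasticsearch, helpers
#   helpers.bulk(Elasticsearch(...), actions)
```

Building the actions in a plain function keeps the "doc" wrapping in one place and lets you inspect the list before anything is sent to the cluster.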

Obj3ctiv3_C_88 answered Sep 28 '22 09:09
