
Updating indexed document in Elasticsearch

I am trying to understand how to update an indexed document in Elasticsearch, but I don't understand how it works. What is the ctx that the API refers to, and what does it do? Say you have a document with nested documents: what do you have to do to update it?

And what is the difference between deleting the document and then indexing the "updated" version, versus a plain update?

LuckyLuke asked Mar 30 '13 18:03


People also ask

Can we update an index in Elasticsearch?

For small changes to an index you can use the update index settings API, which lets you change settings such as the number of replicas or the refresh interval. You can also update documents and add fields to them using the update API.

What is update in Elasticsearch?

In addition to being able to index and replace documents, we can also update documents. Note that Elasticsearch does not actually do in-place updates under the hood. Whenever we do an update, Elasticsearch deletes the old document and then indexes a new document with the update applied to it in one shot.


1 Answer

The update request retrieves the source of the document from Elasticsearch, modifies it, and indexes it back into Elasticsearch. If you already have a copy of the document, using update makes little sense: it is generally faster to just index the new version. However, if you don't have the document readily available but you know which changes you want to make to it, update can be more efficient.
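The retrieve, modify, reindex cycle described above can be modelled in a few lines of Python. This is only a toy in-memory stand-in, not the Elasticsearch client: `ToyIndex` and its `index`, `get`, and `update` methods are hypothetical names chosen for illustration, and the version counter only loosely mirrors the way a document's version grows with each write.

```python
import copy


class ToyIndex:
    """Hypothetical in-memory stand-in for an index (not an Elasticsearch API)."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def index(self, doc_id, source):
        # Indexing always stores a complete document, replacing any old one.
        version = self._docs.get(doc_id, (0, None))[0] + 1
        self._docs[doc_id] = (version, copy.deepcopy(source))
        return version

    def get(self, doc_id):
        version, source = self._docs[doc_id]
        return version, copy.deepcopy(source)

    def update(self, doc_id, change):
        # An update is a get, a modification, and an index, done in one call.
        _, source = self.get(doc_id)
        change(source)
        return self.index(doc_id, source)


idx = ToyIndex()
idx.index(1, {"creators": [{"name": "Steve"}]})

# Path 1: no local copy of the document, so describe only the change.
v_update = idx.update(1, lambda src: src["creators"].append({"name": "John"}))

# Path 2: the full document is already at hand, so just index it again.
full_doc = {"creators": [{"name": "Steve"}, {"name": "John"}, {"name": "Ann"}]}
v_index = idx.index(1, full_doc)

print(v_update, v_index)
print(idx.get(1)[1])
```

In this sketch both paths end with a full document being written and the version going up; the only difference is who does the read and the modification.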

For example, if I don't have a copy of the car document but I want to add a new creator, I can do something like this:

# Start from a clean slate: drop the test index if it already exists
curl -XDELETE localhost:9200/test

# Create the index with creators mapped as a nested field
curl -XPUT localhost:9200/test -d '{
    "settings": {
        "index.number_of_shards": 1,
        "index.number_of_replicas": 0
    },
    "mappings": {
        "car": {
            "properties": {
                "creators" : {
                    "type": "nested",
                    "properties": {
                        "name": {"type":"string"}
                    }
                }
            }
        }
    }
}
'

# Index the initial car document with a single creator
curl -XPOST localhost:9200/test/car/1 -d '{
    "creators": [{
        "name": "Steve"
    }]
}
'

echo
# Append a creator through the update API without fetching the document first
curl -XPOST localhost:9200/test/car/1/_update -d '{
    "script" : "ctx._source.creators += new_creator",
    "params" : {
        "new_creator" : {"name": "John"}
    }
}'

echo
# Fetch the document to confirm that it now lists both creators
curl "localhost:9200/test/car/1?pretty=true"
echo

In the update script, ctx is a special variable that gives you access to the source of the document you want to update. ctx._source is a writable version of that source. You can modify it in the script, and the modified source is persisted as the new version of the document.
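To make the role of ctx concrete, here is a rough Python analogue of what happens to the script and its params. It is a sketch under assumptions, not how Elasticsearch is implemented: `run_update_script` and `add_creator` are hypothetical helpers, and the dictionary named `ctx` merely imitates the variable the real script sees.

```python
import copy


def run_update_script(stored_source, script, params):
    """Hypothetical helper: apply `script` to a writable copy of the source."""
    # The script never touches the stored document directly, only a copy.
    ctx = {"_source": copy.deepcopy(stored_source)}
    script(ctx, params)
    # Whatever the script left in ctx["_source"] becomes the new version.
    return ctx["_source"]


def add_creator(ctx, params):
    # Python analogue of the script: ctx._source.creators += new_creator
    ctx["_source"]["creators"].append(params["new_creator"])


stored = {"creators": [{"name": "Steve"}]}
new_source = run_update_script(
    stored, add_creator, {"new_creator": {"name": "John"}}
)

print(new_source)  # the document that would be indexed back
print(stored)      # the previously stored source, left untouched
```

Passing the new creator through params rather than writing it into the script text keeps the script itself generic, which is the same split the curl request above uses.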

imotov answered Oct 05 '22 23:10