I have a very simple question:
I want to update multiple documents in Elasticsearch. Sometimes the document already exists, but sometimes it does not. I don't want to use a get request to check whether the document exists (this hurts my performance). I want my update request to index the document directly if it doesn't exist yet.
I know that we can use upsert to create a non-existing field when updating a document, but this is not what I want. I want to index the document if it doesn't exist. I don't know if upsert can do this.
Can you provide me with some explanation?
Thanks in advance!
In Elasticsearch, to replace a document you simply have to index a document with the same ID and it will be replaced automatically. If you would like to update a document you can either do a scripted update, a partial update or both.
Elasticsearch allows us to do partial updates, but internally these are “get_then_update” operations, where the whole document is fetched, the changes are applied and then the document is indexed again. Even without disk hits one can imagine the potential performance implications if this is your main use case.
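To make the difference between replacing and partially updating concrete, here is a toy in-memory model in Python. It is only a sketch of the behaviour described above, not how Elasticsearch is implemented: `store`, `index_doc` and `partial_update` are made-up names, and the merge is shallow, which is enough for flat documents.

```python
import copy

# Toy stand-in for an index: maps document id -> document body.
store = {}

def index_doc(doc_id, body):
    """Full replace: whatever was stored under doc_id is overwritten."""
    store[doc_id] = copy.deepcopy(body)

def partial_update(doc_id, partial):
    """Fetch the current document, merge the partial doc, index it again."""
    if doc_id not in store:
        raise KeyError(f"document {doc_id} is missing")
    current = copy.deepcopy(store[doc_id])  # get
    current.update(partial)                 # apply the changes (shallow merge)
    index_doc(doc_id, current)              # index the whole document again

index_doc("1", {"color": "blue", "brand": "mercedes"})
partial_update("1", {"color": "red"})
print(store["1"])  # brand is kept, only color changed

index_doc("1", {"color": "green"})
print(store["1"])  # full replace: brand is gone
```

Note that in this model a partial update on a missing id fails, which is exactly the case the question is about.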
doc_as_upsert is used when you're updating with a partial doc. It saves you from passing the doc twice: once as doc and once as upsert.
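As a small illustration of what that saves, the Python sketch below builds both forms of the update request body. The helper name `update_body` is made up for this example; the keys `doc`, `upsert` and `doc_as_upsert` are the ones the update API accepts.

```python
import json

def update_body(partial, doc_as_upsert=True):
    """Build an update request body for a partial document.

    With doc_as_upsert the partial doc is sent once; without it the same
    content has to be repeated under "upsert" to get the same effect.
    """
    if doc_as_upsert:
        return {"doc": partial, "doc_as_upsert": True}
    return {"doc": partial, "upsert": partial}

partial = {"color": "brown", "brand": "ford"}
print(json.dumps(update_body(partial)))
print(json.dumps(update_body(partial, doc_as_upsert=False)))
```

Both bodies should create the document when it is missing and merge the partial doc when it exists; the first one is just shorter.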
This is doable using the update API. It does require that you define the ID of each document yourself, since the update API needs the ID to determine whether the document is already present.
Given an index created with the following documents:
PUT /cars/car/1
{ "color": "blue", "brand": "mercedes" }

PUT /cars/car/2
{ "color": "blue", "brand": "toyota" }
We can get the upsert functionality you want using the update API with the following call.
POST /cars/car/3/_update
{
  "doc": { "color": "brown", "brand": "ford" },
  "doc_as_upsert": true
}
This call will add the document to the index, since it does not exist yet.
Running the call a second time after changing the color of the car will update the existing document instead of creating a new one.
POST /cars/car/3/_update
{
  "doc": { "color": "black", "brand": "ford" },
  "doc_as_upsert": true
}
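Since the question is about updating multiple documents, the same idea can be sent through the bulk API, which takes newline-delimited JSON: one action line followed by one body line per document. The Python sketch below only builds that payload with the standard library; `bulk_upsert_payload` is a made-up helper name, and it assumes a cluster without mapping types (on an older cluster like the one in the examples above, the action line would also need a `_type`).

```python
import json

def bulk_upsert_payload(index, docs):
    """Build an NDJSON body for the _bulk endpoint.

    docs maps document id -> partial document. Each document becomes an
    "update" action with doc_as_upsert, so it is created when missing and
    merged when it already exists.
    """
    lines = []
    for doc_id, partial in docs.items():
        lines.append(json.dumps({"update": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps({"doc": partial, "doc_as_upsert": True}))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"

payload = bulk_upsert_payload("cars", {
    "3": {"color": "black", "brand": "ford"},
    "4": {"color": "red", "brand": "fiat"},
})
print(payload)
```

The resulting string would be POSTed to /_bulk with the Content-Type header set to application/x-ndjson; how you send it (curl, an HTTP library, an official client) is up to you.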