I have a use case in which concurrent update requests may hit my Elasticsearch cluster. To make sure that a stale event (one made irrelevant by a newer request) does not update a document after a newer event has already reached the cluster, I would like to pass a script with my update requests that compares a field to determine whether the incoming request is still relevant. The request would look like this:
curl -XPOST 'localhost:9200/test/type1/1/_update' -d '
{
"script": " IF ctx._source.user_update_time > my_new_time THEN do not update ELSE proceed with update",
"params": {
"my_new_time": "2014-09-01T17:36:17.517"
},
"doc": {
"name": "new_name"
},
"doc_as_upsert": true
}'
Is the pseudocode I wrote in the "script" field possible in Elasticsearch? If so, I would love some help with the syntax (groovy, python or javascript).
Any alternative approach suggestions would be greatly appreciated too.
Elasticsearch has built-in optimistic concurrency control.
The way it works is that the Update API allows you to use the version parameter to control whether the update should proceed or not.
Taking your example above, the first index/update operation would create a document with version: 1. Now take the case where you have two concurrent requests. Components A and B each send an updated document; both initially retrieved the document with version: 1 and specify that version in their request (see version=1 in the query string below). Elasticsearch will update the document if and only if the provided version is the same as the current one.
Components A and B both send this, but A's request is the first to arrive:
curl -XPOST 'localhost:9200/test/type1/1/_update?version=1' -d '{
"doc": {
"name": "new_name"
},
"doc_as_upsert": true
}'
At this point the version of the document will be 2, and B's request will end up with HTTP 409 Conflict, because B assumed the document was still at version 1 even though the version had increased in the meantime due to A's request.
B can then retrieve the document with the new version (i.e. 2) and try its update again, this time with ?version=2 in the URL. If it is the first request to reach ES, the update will succeed.
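The conflict-and-retry flow described above can be sketched without a running cluster. The following Python snippet is only an illustration: VersionedStore, VersionConflict and update_with_retry are hypothetical names, and the in-memory store merely imitates the version check; it is not the Elasticsearch client API.

```python
class VersionConflict(Exception):
    """Stands in for the HTTP 409 Conflict response."""


class VersionedStore:
    """Hypothetical in-memory stand-in for a versioned document store."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source dict)

    def get(self, doc_id):
        version, source = self._docs[doc_id]
        return version, dict(source)

    def index(self, doc_id, source):
        # Every write bumps the version; a new document starts at 1.
        version = self._docs.get(doc_id, (0, None))[0] + 1
        self._docs[doc_id] = (version, dict(source))
        return version

    def update(self, doc_id, partial_doc, version):
        current_version, source = self._docs[doc_id]
        # Proceed if and only if the caller's version matches the current one.
        if version != current_version:
            raise VersionConflict(f"expected {version}, found {current_version}")
        self._docs[doc_id] = (current_version + 1, {**source, **partial_doc})
        return current_version + 1


def update_with_retry(store, doc_id, partial_doc, version, max_attempts=3):
    """On conflict, re-read the current version and try again."""
    for _ in range(max_attempts):
        try:
            return store.update(doc_id, partial_doc, version)
        except VersionConflict:
            version, _source = store.get(doc_id)
    raise VersionConflict(f"gave up after {max_attempts} attempts")


store = VersionedStore()
store.index("1", {"name": "old_name"})         # document is created at version 1
store.update("1", {"name": "name_from_A"}, 1)  # A arrives first, version becomes 2
# B still believes the version is 1, hits a conflict, re-reads and retries.
new_version = update_with_retry(store, "1", {"name": "name_from_B"}, 1)
```

Note that a blind retry like this lets B overwrite A. For the stale-event use case in the question, B would presumably compare user_update_time on the re-read document and give up if its own event is older.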
I think the script should be like this (it only touches the document when the stored time is not newer than the incoming one):
"script": "if(ctx._source.user_update_time <= my_new_time) ctx._source.user_update_time=my_new_time;"
or
"script": "ctx._source.user_update_time > my_new_time ? ctx.op=\"none\" : ctx._source.user_update_time=my_new_time"
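To make the intent of the ctx.op="none" variant easier to check, here is a rough Python rendering of the same decision. The function apply_event and its return values are hypothetical; it also applies the field changes in the same step, which is an assumption about how the script and the new field values would be combined. It relies on the timestamps being ISO-8601 strings in one fixed format, so that plain string comparison matches chronological order.

```python
def apply_event(source, new_time, doc):
    """Skip stale events, otherwise apply the update.

    Returns "none" when the event is ignored and "index" when the
    document is changed, loosely mirroring ctx.op in the script.
    """
    stored = source.get("user_update_time")
    # Same-format ISO-8601 strings sort chronologically as plain strings.
    if stored is not None and stored > new_time:
        return "none"
    source.update(doc)
    source["user_update_time"] = new_time
    return "index"


src = {"name": "a", "user_update_time": "2014-09-01T17:36:17.517"}
stale_op = apply_event(src, "2014-09-01T17:00:00.000", {"name": "stale"})
fresh_op = apply_event(src, "2014-09-01T18:00:00.000", {"name": "new_name"})
```

Here the first event is older than the stored timestamp and is dropped, while the second is newer and replaces both the name and the timestamp.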