I’m currently using Elasticsearch V2.3.1. I want to use the following Elasticsearch query in Java.
POST /twitter/_update_by_query
{
"script": {
"inline": "ctx._source.List = [‘Item 1’,’Item 2’]”
},
"query": {
"term": {
"user": "kimchy"
}
}
}
The above query searches for “user” named “kimchy” and updates the “List” field with given values. This query updates multiple documents at the same time. I read about the Update API for Java here https://www.elastic.co/guide/en/elasticsearch/client/java-api/2.3/java-docs-update.html but couldn’t find what I was looking for. The Update API for Java only talks about updating single document at a time. Is there any way to update multiple documents? Sorry if I’m missing something obvious. Thank you for your time.
Update:
I tried the below Java Code:
Client client = TransportClient.builder().addPlugin(ReindexPlugin.class)
.build().addTransportAddress(new InetSocketTransportAddress(
InetAddress.getByName("127.0.0.1"), 9300));
UpdateByQueryRequestBuilder ubqrb = UpdateByQueryAction.INSTANCE
.newRequestBuilder(client);
Script script = new Script("ctx._source.List = [\"Item 1\",\"Item 2\"]");
//termQuery is not recognised by the program
BulkIndexByScrollResponse r = ubqrb.source("twitter").script(script)
.filter(termQuery("user", "kimchy")).execute().get();
So I edited the Java Program as above and the termQuery is not identified by Java. May I know what I'm doing wrong here? Thanks.
Note that Elasticsearch does not actually do in-place updates under the hood. Whenever we do an update, Elasticsearch deletes the old document and then indexes a new document with the update applied to it in one shot. In the above example, ctx. _source refers to the current source document that is about to be updated.
Elasticsearch has an Update API that can be used to process updates and deletes. The Update API reduces the number of network trips and potential for version conflicts. The Update API retrieves the existing document from the index, processes the change and then indexes the data again.
Elasticsearch is a NoSQL Database, which is developed in Java programming language. It is a real-time, distributed, and analysis engine that is designed for storing logs. It is a highly scalable document storage engine. Similar to the MongoDB, it stores the data in document format.
As of ES 2.3, the update by query feature is available as the REST endpoint _update_by_query
but nor for Java clients. In order to call this endpoint from your Java client code, you need to include the reindex
module in your pom.xml, like this
<dependency>
<groupId>org.elasticsearch.module</groupId>
<artifactId>reindex</artifactId>
<version>2.3.2</version>
</dependency>
Then you need to include this module when building your client:
clientBuilder.addPlugin(ReindexPlugin.class);
Finally you can call it like this:
UpdateByQueryRequestBuilder ubqrb = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
Script script = new Script("ctx._source.List = [\"Item 1\",\"Item 2\"]");
BulkIndexByScrollResponse r = ubqrb.source("twitter")
.script(script)
.filter(termQuery("user", "kimchy"))
.get();
UPDATE
If you need to specify the type(s) the update should focus on, you can do so:
ubqrb.source("twitter").source().setTypes("type1");
BulkIndexByScrollResponse r = ubqrb.script(script)
.filter(termQuery("user", "kimchy"))
.get();
In ES 7.9 this also works using UpdateByQueryRequest
Map<String, Object> map = new HashMap<String, Object>(); UpdateByQueryRequest updateByQueryRequest = new UpdateByQueryRequest("indexName"); updateByQueryRequest.setConflicts("proceed"); updateByQueryRequest.setQuery(new TermQueryBuilder("_id", documentId)); Script script = new Script(ScriptType.INLINE, "painless", "ctx._source = params", map); updateByQueryRequest.setScript(script);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With