
Deleting all documents without dropping the index in the Elasticsearch Java API

Is there a simple Java API call to delete all the documents from Elasticsearch without dropping the index?

I know that we could get all the IDs and delete each document one by one:

DeleteResponse response = _client.prepareDelete(INDEX, TYPE, id)
            .setRefresh(true)
            .execute()
            .actionGet();

But I am looking for a TRUNCATE-style operation.

At present I am deleting the index and recreating the mapping in unit tests.

Raghu K Nair asked Mar 03 '16 06:03



2 Answers

You can use the delete-by-query plugin to achieve that.

You need to install it on all nodes with:

sudo bin/plugin install delete-by-query

Then add this dependency to your pom.xml, using the version that matches your Elasticsearch version:

<dependency>
    <groupId>org.elasticsearch.plugin</groupId>
    <artifactId>delete-by-query</artifactId>
    <version>2.2.0</version>
</dependency>

Finally, you can use DeleteByQueryRequestBuilder to delete all your documents after your tests.
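As a rough sketch of what that could look like with the 2.x Java client: the helper below runs a match_all delete-by-query against one index and then refreshes it so the deletions are visible to the next search. The class name IndexTruncator and the method name deleteAllDocuments are made up for illustration, and the code assumes a Client that has already been connected elsewhere.

```java
import org.elasticsearch.action.deletebyquery.DeleteByQueryAction;
import org.elasticsearch.action.deletebyquery.DeleteByQueryRequestBuilder;
import org.elasticsearch.action.deletebyquery.DeleteByQueryResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.index.query.QueryBuilders;

// Hypothetical helper for tests; not part of Elasticsearch itself.
public final class IndexTruncator {

    private IndexTruncator() {
    }

    // Deletes every document in the given index but keeps the index and its mapping.
    // Returns the number of documents the plugin reports as deleted.
    public static long deleteAllDocuments(Client client, String index) {
        DeleteByQueryResponse response =
                new DeleteByQueryRequestBuilder(client, DeleteByQueryAction.INSTANCE)
                        .setIndices(index)
                        .setQuery(QueryBuilders.matchAllQuery())
                        .get();

        // Refresh so that searches issued right after this call no longer see the documents.
        client.admin().indices().prepareRefresh(index).get();

        return response.getTotalDeleted();
    }
}
```

If you connect through a TransportClient, the plugin also has to be registered on the client side, for example with TransportClient.builder().addPlugin(DeleteByQueryPlugin.class), otherwise the action is not known to the client.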

Val answered Oct 09 '22 06:10



If it were me, I wouldn't try to truncate data in ES like this. Instead, I would use -0 and -1 suffixed indices and an index alias pointed at the index I considered "hot."

So for example, if you have an index called my-data, I would replace that index with my-data-0 and my-data-1. Then, I would define an alias named my-data pointed at my-data-0.

If I wanted to truncate my index, I'd simply swap the alias for my-data to point it at my-data-1, which would be empty, and away from my-data-0, which obviously isn't since you're trying to truncate it. After that, I would delete my-data-0 and then immediately recreate the index. Next time I need an empty index, I'd do the same thing all over again, just in reverse.

You should note that the alias swap can be handled atomically (see https://www.elastic.co/guide/en/elasticsearch/guide/current/index-aliases.html).
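To make the swap concrete, here is a minimal sketch that only builds the JSON body you would POST to the _aliases endpoint; the remove and add actions in a single request are what make the switch atomic. The class name AliasSwap and the method swapBody are hypothetical, and the index and alias names are assumed to contain no characters that need JSON escaping.

```java
// Hypothetical helper; it only builds the request body and sends nothing.
public final class AliasSwap {

    private AliasSwap() {
    }

    // Builds the body for POST /_aliases that moves `alias` from index `from` to index `to`.
    // Assumes the names contain no quotes or backslashes, so no JSON escaping is done.
    public static String swapBody(String alias, String from, String to) {
        return "{\"actions\":["
                + "{\"remove\":{\"index\":\"" + from + "\",\"alias\":\"" + alias + "\"}},"
                + "{\"add\":{\"index\":\"" + to + "\",\"alias\":\"" + alias + "\"}}"
                + "]}";
    }

    public static void main(String[] args) {
        // Point my-data away from my-data-0 and at the empty my-data-1.
        System.out.println(swapBody("my-data", "my-data-0", "my-data-1"));
    }
}
```

After the swap, my-data-0 can be deleted and recreated with its mapping, ready to become the empty target the next time around.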

You should also note that it will be much faster this way, especially on large indices, and it will make schema evolution a lot easier to manage as well. Please consider whether that would accomplish what you need. If so, I think you'll find it much nicer to work with than delete-by-query.

Evan Volgas answered Oct 09 '22 05:10
