
Backing up, Deleting, Restoring Elasticsearch Indexes By Index Folder

Most of the Elasticsearch documentation discusses working with indexes through the REST API. Is there any reason I can't simply move or delete index folders on disk?

JasonG asked Apr 17 '15 14:04


People also ask

Does deleting index in Elasticsearch delete data?

Yes, deleting an index deletes all the data in that index.

How do I restore Elasticsearch backup?

Taking a snapshot is the only reliable and supported way to back up a cluster. You cannot back up an Elasticsearch cluster by making copies of the data directories of its nodes. There are no supported methods to restore any data from a filesystem-level backup.

Where are Elasticsearch indexes stored?

Indexes are stored on disk in the location set by the path.data option in elasticsearch.yml. The HTTP REST interface listens on localhost port 9200 by default, and the path of the URL generally defines the action to be taken (such as searching for documents).


2 Answers

You can move data around on disk, to a point:

If Elasticsearch is running, it is never a good idea to move or delete the index folders, because Elasticsearch will not know what happened to the data, and you will get all kinds of FileNotFoundExceptions in the logs as well as indices that are red until you manually delete them.

If Elasticsearch is not running, you can move index folders to another node (for instance, if you were decommissioning a node permanently and needed to get the data off). However, if you delete the folder, or move it to a place where Elasticsearch cannot see it when the service is restarted, then Elasticsearch will be unhappy. This is because Elasticsearch writes what is known as the cluster state to disk, and the indices are recorded in this cluster state. So if ES starts up and expects to find index "foo", but you have deleted the "foo" index directory, the index will stay in a red state until it is deleted through the REST API.

Because of this, I would recommend using the REST API whenever possible if you want to move or delete individual indices, since you can get ES into an unhappy state by deleting a folder in which it expects to find an index.
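As a rough illustration of going through the REST API instead of the filesystem, here is a minimal sketch that lists index health and deletes an index. The `ES_URL` default, the `DRY_RUN` switch, and the function names are my own assumptions for the example, not something from the Elasticsearch docs; the endpoints themselves (`_cat/indices` and `DELETE /<index>`) are standard.

```shell
#!/bin/sh
# Assumed example value; point this at your own node.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Hypothetical helper: with DRY_RUN=1 the command is printed, not executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# List all indices with their health column (green / yellow / red).
list_indices() {
  run curl -fsS -XGET "$ES_URL/_cat/indices?v"
}

# Delete an index through the REST API, so the cluster state is updated
# along with the files on disk. Usage: delete_index foo
delete_index() {
  run curl -fsS -XDELETE "$ES_URL/$1"
}
```

Running `DRY_RUN=1 delete_index foo` first shows the exact request before anything is removed.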

EDIT: I should mention that it's safe to copy (for backups) an indices folder, from the perspective of Elasticsearch, because copying doesn't modify the contents of the folder. Sometimes people do this to perform backups outside of the snapshot & restore API.
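For comparison, the snapshot & restore API mentioned above can be driven with a few curl calls. This is only a sketch: the repository name, snapshot name, and location are whatever you pass in, and `ES_URL`, `DRY_RUN`, and the function names are assumptions made for the example. A shared-filesystem (`fs`) repository also needs its location listed under `path.repo` in elasticsearch.yml.

```shell
#!/bin/sh
# Assumed example value; point this at your own node.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Hypothetical helper: with DRY_RUN=1 the command is printed, not executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# register_repo REPO LOCATION
# Register a shared-filesystem snapshot repository.
register_repo() {
  run curl -fsS -XPUT "$ES_URL/_snapshot/$1" \
    -H 'Content-Type: application/json' \
    -d "{\"type\":\"fs\",\"settings\":{\"location\":\"$2\"}}"
}

# snapshot_index REPO SNAPSHOT INDEX
# Snapshot a single index and wait for the snapshot to finish.
snapshot_index() {
  run curl -fsS -XPUT "$ES_URL/_snapshot/$1/$2?wait_for_completion=true" \
    -H 'Content-Type: application/json' \
    -d "{\"indices\":\"$3\"}"
}

# restore_snapshot REPO SNAPSHOT
# Restore everything contained in the named snapshot.
restore_snapshot() {
  run curl -fsS -XPOST "$ES_URL/_snapshot/$1/$2/_restore"
}
```

Note that an index generally has to be closed or deleted before a snapshot of it can be restored over it.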

Lee H answered Oct 11 '22 07:10


I use this procedure: I close, back up, then delete the indexes.

curl -XPOST "http://127.0.0.1:9200/*index_name*/_close"

After this point all index data is on disk and in a consistent state, and no writes are possible. I copy the directory where the index is stored and then delete it:

curl -XDELETE "http://127.0.0.1:9200/*index_name*"

By closing the index, Elasticsearch stops all access to the index. Then I send a command to delete the index (and all corresponding files on disk).
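Put together, the procedure could be scripted roughly like this. Treat it as a sketch under assumptions: `ES_URL`, `DRY_RUN`, and the `archive_index` function are names I made up for the example, and you have to supply the index directory yourself, since its location under path.data depends on your Elasticsearch version and configuration.

```shell
#!/bin/sh
# Assumed example value; point this at your own node.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Hypothetical helper: with DRY_RUN=1 the command is printed, not executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# archive_index NAME INDEX_DIR BACKUP_DIR
# Close the index, copy its directory, then delete it via the REST API.
archive_index() {
  name=$1
  index_dir=$2
  backup_dir=$3

  # 1. Close the index; stop here if the request fails.
  run curl -fsS -XPOST "$ES_URL/$name/_close" || return 1

  # 2. Copy the index directory, preserving ownership and timestamps.
  run cp -a "$index_dir" "$backup_dir/" || return 1

  # 3. Delete the index, which also removes its files on disk.
  run curl -fsS -XDELETE "$ES_URL/$name"
}
```

The delete only runs if both the close and the copy succeeded, so a failed backup never leads to a deleted index.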

LordTamo answered Oct 11 '22 08:10