 

Elasticsearch hot-backup strategies


It would be interesting if someone could share their best 'hot-backup' strategies for Elasticsearch.

Also, feel free to share related tools and libraries that can help with this problem.

Update: Thank you @javanna for your response; it's quite complete and gives good direction for further actions.

I also did some research and found some articles/discussions that might help if anyone is interested.

  • Elasticsearch backup strategies
  • Backup/restore Elasticsearch index and related snippet on github:gist
  • Elastic Search Backup and Recovery discussion (check Paul Smith's comment; he also shared a useful link to his tool for verifying indexes)

Update: Elasticsearch 1.0 has an "official" backup solution, the Snapshot/Restore API, and this is the only right way to do it now. Elasticsearch identifies the primary shards and takes care of consistency. Backups are done incrementally, so you can run them quickly and as often as you want.
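For illustration, here is a minimal sketch of driving the Snapshot/Restore API over HTTP with Python's requests library. The node address, repository name, snapshot name, and backup path are all assumptions; the shared filesystem location must be reachable from every node.

    import requests

    ES = "http://localhost:9200"  # assumed node address

    # Register a shared-filesystem repository; the location must be a
    # path every node in the cluster can reach (e.g. an NFS mount).
    requests.put(ES + "/_snapshot/my_backup", json={
        "type": "fs",
        "settings": {"location": "/mount/backups/my_backup"},
    }).raise_for_status()

    # Take a snapshot; after the first run, only changed data is copied.
    requests.put(ES + "/_snapshot/my_backup/snapshot_1",
                 params={"wait_for_completion": "true"}).raise_for_status()

    # To restore later:
    # requests.post(ES + "/_snapshot/my_backup/snapshot_1/_restore")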

asked Oct 11 '12 by gakhov



1 Answer

Replicas are a sort of backup, and Elasticsearch never allocates a replica on the same node as its primary shard. But there is still a risk of losing data, depending on how many shards, replicas, and nodes you have in your cluster.
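As a small aside, the replica count can be changed on a live index through the update-settings API. A minimal sketch using Python's requests, where the index name and node address are assumptions:

    import requests

    ES = "http://localhost:9200"  # assumed node address

    # Raise the replica count of a live index; "my_index" is illustrative.
    requests.put(ES + "/my_index/_settings", json={
        "index": {"number_of_replicas": 2},
    }).raise_for_status()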

I would look at the Gateway module, through which you can save the index and the cluster metadata. There are different types of gateway. I'd look at the Shared FS gateway, for example, which lets you copy the index and the metadata to a file system shared between all your nodes. You can also manually trigger a snapshot through the Gateway Snapshot API.
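If it helps, this is roughly how a manual gateway snapshot was triggered in pre-1.0 releases; the endpoint has long since been removed, and the node address is an assumption:

    import requests

    ES = "http://localhost:9200"  # assumed node address

    # Trigger a gateway snapshot for all indices (pre-1.0 API, since removed).
    requests.post(ES + "/_gateway/snapshot").raise_for_status()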

Also, you can make a copy of the data directory (on every node) once you've disabled flush through the index.translog.disable_flush index setting. That way you make sure no Lucene commit is issued while you're copying. After you've made the copy, you need to enable flush again.
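A rough sketch of that sequence, assuming a single-node setup with a local data directory; the paths and node address are illustrative, and on a real cluster the copy step has to run on every node:

    import shutil

    import requests

    ES = "http://localhost:9200"  # assumed node address

    def set_disable_flush(disabled):
        # Applies the setting named above to all indices.
        requests.put(ES + "/_settings", json={
            "index": {"translog.disable_flush": disabled},
        }).raise_for_status()

    set_disable_flush(True)   # stop Lucene commits while copying
    try:
        # Illustrative paths; repeat on every node in the cluster.
        shutil.copytree("/var/lib/elasticsearch/data", "/backups/es-data-copy")
    finally:
        set_disable_flush(False)  # re-enable flush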

UPDATE

All the gateway types except for the local one have been deprecated and will be removed in a future version. Elasticsearch 1.0 will be released with a better backup solution.

answered Oct 12 '22 by javanna