 

AWS elasticsearch availability zone awareness and replica

I have some questions regarding AWS elasticsearch availability zone awareness and replica:

  1. As I understand it, if shards are replicated between nodes, then after a zone or node failure the cluster can recover completely, and I will have a whole copy of the data in each zone. Is that correct?

  2. According to the AWS Elasticsearch documentation, I must use the Amazon Elasticsearch API to replicate the data for an Amazon Elasticsearch cluster across the nodes in the Availability Zones. But I could not find a way to configure replicas via the Amazon Elasticsearch API, so I guess it is done via the Elastic API, right?

  3. What is the best practice for allocating cluster nodes across two Availability Zones in the same region (sa-east-1)? How many dedicated master instances and data nodes are enough for failover, at least at the start of a new environment? Are 2 dedicated masters and 2 data nodes enough to prevent data loss and downtime in case of a failure? I guess 1 replica should be configured for the index. I was also thinking about 2-3 dedicated masters, 3 data nodes, and 2 replicas for each index.

  4. There is no settings file in AWS Elasticsearch, so the only way to change the number of replicas is via the Elastic API, but I can’t find a way to change the default. When a new index is created, the number of replica shards is 1, which is the default. Is there a way to change the default settings for every new index?

Something like this only changes the existing indexes:

curl -XPUT 'https://search-aa1-a3qlyghdz2i6wszffnv4iz5cyi.sa-east-1.es.amazonaws.com/_all/_settings' -d '
{
    "index" : {
        "number_of_replicas" : 2
    }
}'

http://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-managedomains.html

Thank you for any help you can provide !

Berlin asked Nov 26 '15 16:11


1 Answer

Here are some answers to your questions.

Full disclosure first: I'm an Elastic employee and work on the Found team.

As I understand it, if shards are replicated between nodes, then after a zone or node failure the cluster can recover completely, and I will have a whole copy of the data in each zone. Is that correct?

Technically, yes. With shard replicas, any data written to primary shard 0 is also written to its replica, which should be located on a second node in a different zone.
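One way to check where the primary and replica copies actually ended up is the `_cat/shards` API. The following is a minimal sketch, not something taken from the AWS documentation: the endpoint is a placeholder, and the function names `list_shards` and `count_unassigned_replicas` are made up for this example.

```shell
# Hypothetical endpoint; substitute your own domain's endpoint.
ES_ENDPOINT="${ES_ENDPOINT:-https://your-domain.sa-east-1.es.amazonaws.com}"

# List every shard copy: index, shard number, p(rimary) or r(eplica), state, node.
list_shards() {
  curl -s "$ES_ENDPOINT/_cat/shards?h=index,shard,prirep,state,node"
}

# Read _cat/shards output on stdin and count replica copies that are
# not allocated to any node (state UNASSIGNED).
count_unassigned_replicas() {
  awk '$3 == "r" && $4 == "UNASSIGNED" { n++ } END { print n + 0 }'
}
```

Running `list_shards | count_unassigned_replicas` against a healthy cluster should print 0. A non-zero count means some replicas have no node to live on, for example when the replica count is higher than the remaining data nodes can hold.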

Be aware that AWS ES only snapshots your data once a day, though you can take manual snapshots whenever you like. In Found, snapshots are taken every 30 minutes.

According to the AWS Elasticsearch documentation, I must use the Amazon Elasticsearch API to replicate the data for an Amazon Elasticsearch cluster across the nodes in the Availability Zones. But I could not find a way to configure replicas via the Amazon Elasticsearch API, so I guess it is done via the Elastic API, right?

I'm not 100% sure how the AWS ES API works, but the documentation suggests that all replicas have to be configured via the AWS ES API, not the ES API.

If you were manually administering the Elasticsearch cluster, you could configure replicas with the ES API (https://www.elastic.co/guide/en/elasticsearch/guide/current/replica-shards.html), and there are settings for zone awareness (https://www.elastic.co/guide/en/elasticsearch/reference/current/allocation-awareness.html).
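On a self-managed cluster (this does not apply to the AWS service, where you cannot edit node configuration), zone awareness can be sketched roughly as below. It assumes every node was started with a custom attribute named `zone`; that attribute name, the `localhost` endpoint, and the function name are assumptions for illustration, and how the attribute is declared in the node configuration depends on the Elasticsearch version.

```shell
# Assumed endpoint of a self-managed cluster.
ES_ENDPOINT="${ES_ENDPOINT:-http://localhost:9200}"

# Tell the allocator to spread copies of a shard across values of the
# (assumed) "zone" node attribute.
AWARENESS_BODY='{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "zone"
  }
}'

enable_zone_awareness() {
  curl -XPUT "$ES_ENDPOINT/_cluster/settings" \
       -H 'Content-Type: application/json' \
       -d "$AWARENESS_BODY"
}
```

Calling `enable_zone_awareness` sends the request. With awareness on and at least 1 replica, Elasticsearch tries to keep a primary and its replica in different zones.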

In Found this is all configured for you when you create your cluster by indicating the region and how many Availability Zones you wish to use. Found also allows you to increase or decrease the number of Availability Zones directly through the console.

What is the best practice for allocating cluster nodes across two Availability Zones in the same region (sa-east-1)? How many dedicated master instances and data nodes are enough for failover, at least at the start of a new environment? Are 2 dedicated masters and 2 data nodes enough to prevent data loss and downtime in case of a failure? I guess 1 replica should be configured for the index. I was also thinking about 2-3 dedicated masters, 3 data nodes, and 2 replicas for each index.

Using a single master node in a 2 Availability Zone configuration would still leave you open to failure if the AZ that housed the master node failed. The AWS documentation (http://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-createupdatedomains.html#es-createdomains-configure-cluster) suggests an odd number of master nodes to help avoid split-brain, and you'd want enough data nodes in each AZ that either zone can carry the load on its own. So something like 3 master nodes (1 per AZ and 1 extra to help with elections) and 4 data nodes (2 per AZ), with at least 1 replica.
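The odd-number advice comes down to majority arithmetic: a master can only be elected when more than half of the master-eligible nodes agree. A small illustration (the function names are invented for this example):

```shell
# Majority (quorum) needed among N master-eligible nodes.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# How many master-eligible nodes can be lost while a majority remains.
tolerated_failures() {
  echo $(( $1 - $(quorum "$1") ))
}

for n in 1 2 3; do
  echo "masters=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With 2 masters the quorum is still 2, so losing either one stops elections. That is why 2 dedicated masters give no more failover protection than 1, while 3 can survive the loss of one.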

Having said that, with Found this is all taken care of. For example, if you set up an HA cluster in sa-east-1, Found would set up 2 data nodes in each AZ (with replicas), and the master node and election are handled by the Found infrastructure, which is also managed across both zones. This prevents split-brain due to network latency or other network issues, and protects against total DC failure. You can refer to https://www.elastic.co/blog/found-elasticsearch-in-production#networking for more information.

There is no settings file in AWS Elasticsearch, so the only way to change the number of replicas is via the Elastic API, but I can’t find a way to change the default. When a new index is created, the number of replica shards is 1, which is the default. Is there a way to change the default settings for every new index?

The default recommended by Elastic is 1 replica. Before using more than 1 replica, you'll want to understand why you need them (https://www.elastic.co/guide/en/elasticsearch/guide/current/replica-shards.html#_balancing_load_with_replicas).

Index templates may help with your current situation: you can set the defaults you want in a template, and they will be applied to any future indices. See https://www.elastic.co/guide/en/elasticsearch/guide/current/index-templates.html
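As a rough sketch of what such a template could look like, the request below registers a catch-all template that sets 2 replicas on every index created afterwards. The endpoint and the template name `default_replicas` are placeholders. The `"template": "*"` pattern field matches the older Elasticsearch versions from the time of this question; newer versions use `index_patterns` instead, so check the documentation for the version your domain runs.

```shell
# Hypothetical endpoint; substitute your own domain's endpoint.
ES_ENDPOINT="${ES_ENDPOINT:-https://your-domain.sa-east-1.es.amazonaws.com}"

# Catch-all template: applies to every index name, lowest priority (order 0).
TEMPLATE_BODY='{
  "template": "*",
  "order": 0,
  "settings": {
    "index": {
      "number_of_replicas": 2
    }
  }
}'

# "default_replicas" is an arbitrary template name chosen for this example.
put_default_replicas_template() {
  curl -XPUT "$ES_ENDPOINT/_template/default_replicas" \
       -H 'Content-Type: application/json' \
       -d "$TEMPLATE_BODY"
}
```

Calling `put_default_replicas_template` sends the request. Templates only affect indices created after the template exists, so existing indices still need the `_settings` call from the question.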

If you want more information on Elastic's Found offering, please visit https://www.elastic.co/found and https://www.elastic.co/found-elasticsearch-as-a-service-with-alerts

cstrzadala answered Oct 11 '22 22:10