Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to setup ElasticSearch cluster with auto-scaling on Amazon EC2?

There is a great tutorial elasticsearch on ec2 about configuring ES on Amazon EC2. I studied it and applied all recommendations.

Now I have AMI and can run any number of nodes in the cluster from this AMI. Auto-discovery is configured and the nodes join the cluster as they really should.

The question is How to configure cluster in way that I can automatically launch/terminate nodes depending on cluster load?

For example I want to have only 1 node running when we don't have any load and 12 nodes running on peak load. But wait, if I terminate 11 nodes in cluster what would happen with shards and replicas? How to make sure I don't lose any data in cluster if I terminate 11 nodes out of 12 nodes?

I might want to configure S3 Gateway for this. But all the gateways except for local are deprecated.

There is an article in the manual about shards allocation. May be I'm missing something very basic but I should admit I failed to figure out if it is possible to configure one node to always hold all the shards copies. My goal is to make sure that if this would be the only node running in the cluster we still don't lose any data.

The only solution I can imagine now is to configure index to have 12 shards and 12 replicas. Then when up to 12 nodes are launched every node would have copy of every shard. But I don't like this solution cause I would have to reconfigure cluster if I might want to have more then 12 nodes on peak load.

like image 967
Victor Smirnov Avatar asked Aug 02 '13 07:08

Victor Smirnov


People also ask

How do I scale a Elasticsearch cluster?

To scale out Elasticsearch, you provision extra storage, update the deployment configuration, and then perform a helm upgrade. You can verify that the scale out was successful by checking cluster status and health, as well as node and shard health. To avoid data loss, you scale in Elasticsearch one node at a time.

Does EC2 support Auto Scaling?

Amazon EC2 Auto Scaling can automatically balance instances across zones, and always launches new instances so that they are balanced between zones as evenly as possible across your entire fleet.


2 Answers

Auto scaling doesn't make a lot of sense with ElasticSearch.

Shard moving and re-allocation is not a light process, especially if you have a lot of data. It stresses IO and network, and can degrade the performance of ElasticSearch badly. (If you want to limit the effect you should throttle cluster recovery using settings like cluster.routing.allocation.cluster_concurrent_rebalance, indices.recovery.concurrent_streams, indices.recovery.max_size_per_sec . This will limit the impact but will also slow the re-balancing and recovery).

Also, if you care about your data you don't want to have only 1 node ever. You need your data to be replicated, so you will need at least 2 nodes (or more if you feel safer with a higher replication level).

Another thing to remember is that while you can change the number of replicas, you can't change the number of shards. This is configured when you create your index and cannot be changed (if you want more shards you need to create another index and reindex all your data). So your number of shards should take into account the data size and the cluster size, considering the higher number of nodes you want but also your minimal setup (can fewer nodes hold all the shards and serve the estimated traffic?).

So theoretically, if you want to have 2 nodes at low time and 12 nodes on peak, you can set your index to have 6 shards with 1 replica. So on low times you have 2 nodes that hold 6 shards each, and on peak you have 12 nodes that hold 1 shard each.

But again, I strongly suggest rethinking this and testing the impact of shard moving on your cluster performance.

like image 162
Rotem Hermon Avatar answered Sep 26 '22 07:09

Rotem Hermon


In cases where the elasticity of your application is driven by a variable query load you could setup ES nodes configured to not store any data (node.data = false, http.enabled = true) and then put them in for auto scaling. These nodes could offload all the HTTP and result conflation processing from your main data nodes (freeing them up for more indexing and searching).

Since these nodes wouldn't have shards allocated to them bringing them up and down dynamically shouldn't be a problem and the auto-discovery should allow them to join the cluster.

like image 12
edparris Avatar answered Sep 27 '22 07:09

edparris