
How many simultaneous requests can I send to ElasticSearch cluster?

I want to send multiple bulk operation requests to an ElasticSearch cluster, and I ran into this error: EsRejectedExecutionException[rejected execution (queue capacity 50) on org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction

I have a cluster of 4 ElasticSearch instances (version 1.3.4). To check the size of its bulk thread pool, I sent this request:

GET /_cat/thread_pool?v&h=host,bulk.active,bulk.queueSize

I got

host    bulk.active bulk.queueSize
1D4HPY1           0             50 
1D4HPY2           0             50
1D4HPY3           0             50 
1D4HPY4           0             50

So how many simultaneous bulk operation requests can I send to that cluster? 50 or 200?

Truong Ha asked Oct 20 '14 14:10




1 Answer

I would suggest having a look at the thread pool section of the documentation.

Also, you need to be more specific when you say "simultaneous requests that you can send" because, as that page shows, there are various thread pools that handle various jobs. The example in your post is for "bulk" operations.

In my opinion, the right request for seeing how many "bulk" threads can run simultaneously (as per this piece of documentation) is GET /_cat/thread_pool?v&h=host,bulk.queueSize,bulk.min,bulk.max. Each node has its own thread pools, so on every node up to bulk.max threads can be active at once, with up to bulk.queueSize further tasks waiting in that node's queue. When a request comes in and no thread is free to handle it, it is put in the queue to wait; when the queue is full as well, it is rejected, which is the EsRejectedExecutionException you are seeing.
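To make the thread/queue mechanics concrete, here is a minimal Python sketch of the idea. It is not Elasticsearch code. It models a single idle node's bulk pool as a fixed number of worker threads plus a bounded queue, and splits a burst of simultaneous tasks into running, queued, and rejected. The function name `admit` and the figure of 8 threads are assumptions for illustration only; 50 is the queue size from the question.

```python
def admit(burst, max_threads, queue_size):
    """Split a burst of simultaneous tasks hitting one idle node's pool
    into (running, queued, rejected) counts."""
    # Tasks are picked up by free threads first.
    running = min(burst, max_threads)
    # Whatever is left waits in the bounded queue, up to its capacity.
    queued = min(burst - running, queue_size)
    # Anything beyond threads + queue is rejected.
    rejected = burst - running - queued
    return running, queued, rejected


# Hypothetical node: 8 bulk threads (assumed), queue of 50 (from the question).
print(admit(100, max_threads=8, queue_size=50))  # (8, 50, 42)
```

Keep in mind that a "task" here is not necessarily one client request: as far as I know, a single bulk request can be split into one task per shard it touches, which would explain why the rejection in the question is reported on a shard-level action.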

Andrei Stefan answered Sep 28 '22 07:09
