I have a 40 node Elasticsearch cluster which is hammered by a high index request rate. Each of these nodes makes use of an SSD for the best performance. As suggested from several sources, I have tried to prevent index throttling with the following configuration:
indices.store.throttle.type: none
Unfortunately, I'm still seeing performance issues as the cluster still periodically throttles indices. This is confirmed by the following logs:
[2015-03-13 00:03:12,803][INFO ][index.engine.internal    ] [CO3SCH010160941] [siphonaudit_20150313][19] now throttling indexing: numMergesInFlight=6, maxNumMerges=5
[2015-03-13 00:03:12,829][INFO ][index.engine.internal    ] [CO3SCH010160941] [siphonaudit_20150313][19] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
[2015-03-13 00:03:13,804][INFO ][index.engine.internal    ] [CO3SCH010160941] [siphonaudit_20150313][19] now throttling indexing: numMergesInFlight=6, maxNumMerges=5
[2015-03-13 00:03:13,818][INFO ][index.engine.internal    ] [CO3SCH010160941] [siphonaudit_20150313][19] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
[2015-03-13 00:05:00,791][INFO ][index.engine.internal    ] [CO3SCH010160941] [siphon_20150313][6] now throttling indexing: numMergesInFlight=6, maxNumMerges=5
[2015-03-13 00:05:00,808][INFO ][index.engine.internal    ] [CO3SCH010160941] [siphon_20150313][6] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
[2015-03-13 00:06:00,861][INFO ][index.engine.internal    ] [CO3SCH010160941] [siphon_20150313][6] now throttling indexing: numMergesInFlight=6, maxNumMerges=5
[2015-03-13 00:06:00,879][INFO ][index.engine.internal    ] [CO3SCH010160941] [siphon_20150313][6] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
The throttling occurs after one of the 40 nodes dies for various expected reasons. The cluster immediately enters a yellow state, in which a number of shards will begin initializing on the remaining nodes.
Any idea why the cluster continues to throttle after explicitly configuring it not to? Any other suggestions to have the cluster more quickly return to a green state after a node failure?
The purpose of throttling is to prevent any action to be executed too many times, thus generating too many notifications.
If you're using time-based index names, for example daily indices for logging, and you don't have enough data, a good way to reduce the number of shards would be to switch to a weekly or a monthly pattern. You can also group old read-only indices., by month, quarter or year.
A good rule-of-thumb is to ensure you keep the number of shards per node below 20 per GB heap it has configured. A node with a 30GB heap should therefore have a maximum of 600 shards, but the further below this limit you can keep it the better. This will generally help the cluster stay in good health.
The setting that actually corresponds to the maxNumMerges in the log file is called index.merge.scheduler.max_merge_count. Increasing this along with index.merge.scheduler.max_thread_count (where max_thread_count <= max_merge_count) will increase the number of simultaneous merges which are allowed for segments within an individual index's shards.
If you have a very high indexing rate that results in many GBs in a single index, you probably want to raise some of the other assumptions that the Elasticsearch default settings make about segment size, too. Try raising the floor_segment - the minimum size before a segment will be considered for merging, the max_merged_segment - the maximum size of a single segment, and the segments_per_tier -- the number of segments of roughly equivalent size before they start getting merged into a new tier. On an application that has a high indexing rate and finished index sizes of roughly 120GB with 10 shards per index, we use the following settings:
curl -XPUT /index_name/_settings -d'
{
  "settings": {
    "index.merge.policy.max_merge_at_once": 10,
    "index.merge.scheduler.max_thread_count": 10,
    "index.merge.scheduler.max_merge_count": 10,
    "index.merge.policy.floor_segment": "100mb",
    "index.merge.policy.segments_per_tier": 25,
    "index.merge.policy.max_merged_segment": "10gb"
  }
}
Also, one important thing you can do to improve loss-of-node/node restarted recovery time on applications with high indexing rates is taking advantage of index recovery prioritization (in ES >= 1.7). Tune this setting so that the indices that receive the most indexing activity are recovered first. As you may know, the "normal" shard initialization process just copies the already-indexed segment files between nodes. However, if indexing activity is occurring against a shard before or during initialization, the translog with the new documents can become very large. In the scenario where merging goes through the roof during recovery, it's the replay of this translog against the shard that is almost always the culprit. Thus, using index recovery prioritization to recover those shards first and delay shards with less indexing activity, you can minimize the eventual size of the translog which will dramatically improve recovery time.
We are using 1.7 and noticed a similar problem: The indexing getting throttled even when the IO was not saturated (Fusion IO in our case).
After increasing "index.merge.scheduler.max_thread_count" the problem seems to be gone -- we did not see any more throttling being logged so far.
I would try setting "index.merge.scheduler.max_thread_count" to at least the max reported numMergesInFlight (6 in the logs above).
https://www.elastic.co/guide/en/elasticsearch/reference/1.7/index-modules-merge.html#scheduling
Hope this helps!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With