
ElasticSearch - Determining maximum shard size

Hopefully this question isn't out of date, but I haven't found a clear answer anywhere yet. According to one of the ES presentations from last year (http://www.elasticsearch.org/videos/big-data-search-and-analytics/), there's a "maximum" size for a shard. I'm trying to determine this for my application, but as far as I can tell, I haven't hit it yet. Does anyone know how a single-shard index behaves once it has reached its maximum? Do inserts fail, or does the index just become unusable?

coug_ asked Jun 06 '13 14:06



1 Answer

To test this myself, I indexed all the English articles in Wikipedia (without any history information) in a single Elasticsearch shard. The Elasticsearch data folder had grown to ~42GB by the end of the test. The lessons learned:

  • Indexing speed was not affected by the size of the shard. Mind you, I did not try indexing with more than one thread at a time, but single-threaded indexing speed stayed more or less constant for the duration of the test.
  • Querying speed, on the other hand, was drastically affected by shard size, especially once more than one user queries at a time. The exact numbers will depend heavily on the power of your machine, the data structure, and how many threads are querying. To give you an idea, with Elasticsearch running on my dev machine, querying the Wikipedia shard with 25 concurrent users resulted in an average response time of 3.5 seconds (with peaks towards half a minute).
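If you want to take a similar measurement on your own data, a minimal sketch of a concurrent query test could look like the one below. This is not the harness used for the numbers above. The index name `wikipedia`, the `localhost:9200` address, the query text, and the helper names `es_search`, `timed`, and `run_load_test` are all assumptions made for illustration. The query function is passed in as a parameter, so you can swap in whatever request is representative of your application.

```python
import json
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def es_search(url="http://localhost:9200/wikipedia/_search", text="shard"):
    """Send one full-text query to a local index (URL and index are assumed)."""
    body = json.dumps({"query": {"query_string": {"query": text}}}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        response.read()  # consume the body so the full response is timed


def timed(query_fn):
    """Return the wall-clock seconds taken by one call to query_fn."""
    start = time.perf_counter()
    query_fn()
    return time.perf_counter() - start


def run_load_test(query_fn, users=25, requests_per_user=10):
    """Simulate `users` concurrent clients and summarize response times."""
    def one_user(_):
        # Each simulated user issues its requests one after another.
        return [timed(query_fn) for _ in range(requests_per_user)]

    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(one_user, range(users)))

    latencies = [t for user in per_user for t in user]
    return {
        "requests": len(latencies),
        "mean_s": statistics.mean(latencies),
        "max_s": max(latencies),
    }

# Example (needs a running node): print(run_load_test(es_search))
```

Running it with increasing values of `users` against the same shard should show whether the average and peak response times stay within what your application can tolerate.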

My conclusion is that a shard that is too large will not make Elasticsearch fail from indexing alone. Querying the large shard, however, may be too slow for your needs or, in certain situations, may even break Elasticsearch with an OutOfMemoryError (for example, with a big faceted query).

This answer is based on my own investigation. The full story is on my blog:

http://blog.trifork.com/2013/09/26/maximum-shard-size-in-elasticsearch/
http://blog.trifork.com/2013/11/05/maximum-shard-size-in-elasticsearch-revisited/

bogdan answered Nov 07 '22 11:11
