Hopefully this question isn't out of date, but I haven't found a clear answer anywhere yet. According to one of the ES presentations from last year (http://www.elasticsearch.org/videos/big-data-search-and-analytics/), there's a "maximum" size for a shard. I'm trying to determine this for my application, but as far as I can tell, I haven't hit it yet. Does anyone know what the behavior is of a single-shard index that has reached its maximum size? Do inserts fail, or is it just that the index becomes unusable?
To test this myself, I indexed all the English articles in Wikipedia (without any history information) in a single elasticsearch shard. The elasticsearch data folder grew to ~42GB at the end of the test. Lessons learned are:
My conclusion is that an overly large shard will not make elasticsearch fail just from indexing. Querying the large shard may be too slow for your needs or, in certain situations, may even break elasticsearch with an OutOfMemoryError (for example, with a big faceted query).
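If you want to repeat a test like this, the index has to be created with exactly one shard before you start indexing. Here is a minimal sketch of that step, using only the Python standard library. The host `http://localhost:9200`, the index name `wikipedia_single_shard`, and the helper `build_single_shard_request` are my own placeholders, and setting replicas to 0 is an assumption so that disk usage reflects a single copy; it is not necessarily the exact setup from my test.

```python
import json
import urllib.request


def build_single_shard_request(base_url, index_name):
    """Build (without sending) a PUT request that creates a one-shard index.

    base_url and index_name are hypothetical values supplied by the caller.
    """
    body = {
        "settings": {
            "index": {
                "number_of_shards": 1,    # all documents end up in one shard
                "number_of_replicas": 0,  # assumption: no replica copies
            }
        }
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    req = build_single_shard_request("http://localhost:9200", "wikipedia_single_shard")
    # Inspect the request before sending it anywhere.
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually create the index against a running node:
    # urllib.request.urlopen(req)
```

The number of primary shards is fixed when the index is created, so it has to be set here rather than after the data is loaded.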
This answer is based on my own investigation. Full story can be read on my blog:
http://blog.trifork.com/2013/09/26/maximum-shard-size-in-elasticsearch/
http://blog.trifork.com/2013/11/05/maximum-shard-size-in-elasticsearch-revisited/