What is the use case for index closing in ElasticSearch?

I have found out that an ES index can be closed: https://www.elastic.co/guide/en/elasticsearch/reference/6.3/indices-open-close.html

A closed index has almost no overhead on the cluster (except for maintaining its metadata), and is blocked for read/write operations.

I am trying to optimise ES for writing a lot of data, i.e. 100K messages per second. Every hour a new index is created, and older indexes are no longer written to. However, older indexes may still be read.

Should I close old indexes to optimise writing and then open them on demand when I need to search them?

Nikolay Kuznetsov asked Feb 04 '23 22:02


2 Answers

If your index is closed, you obviously cannot read/search from it. Some operations, like changing index analyzers, require you to close the index before doing so and reopen it afterwards.
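As a rough sketch of the close / change / reopen cycle mentioned above, the snippet below only builds the three REST requests and does not send them. The index name and the analyzer definition are hypothetical, chosen for illustration; the endpoints (`_close`, `_settings`, `_open`) are the standard index APIs.

```python
import json


def analyzer_update_requests(index, analysis_settings):
    """Return (method, path, body) tuples for a close / update / reopen cycle."""
    return [
        # 1. Close the index: it is blocked for reads and writes from here on.
        ("POST", f"/{index}/_close", None),
        # 2. Update the analysis settings while the index is closed.
        ("PUT", f"/{index}/_settings", json.dumps({"analysis": analysis_settings})),
        # 3. Reopen the index so it can be searched again.
        ("POST", f"/{index}/_open", None),
    ]


# Hypothetical index name and analyzer, for illustration only.
analyzer_steps = analyzer_update_requests(
    "messages-2023.02.04.21",
    {
        "analyzer": {
            "my_analyzer": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase"],
            }
        }
    },
)

for method, path, body in analyzer_steps:
    print(method, path, body or "")
```

Each tuple can be sent with whatever HTTP client you already use. Note that the index is unavailable for both reads and writes between the first and third request.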

Other than that, if you know you'll need to read/search from your old indexes, then simply keep them open. It makes no sense to close/reopen them every time you need to read from them.

If you really want to optimize for writes, what you can do is implement a hot/warm architecture and move your old indexes to warm nodes, while keeping the new one you're writing to on hot nodes.

There are also a handful of other best practices you can implement if you want to optimize your indexing speed.
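A minimal sketch of the "move to warm nodes" step, assuming the nodes were started with a custom attribute such as `node.attr.box_type: hot` or `node.attr.box_type: warm`. The attribute name `box_type` and the index name are assumptions, not something stated in the question. The snippet only builds the settings request that asks the cluster to relocate the index's shards; it does not send it.

```python
import json


def move_to_warm_request(index, attribute="box_type", value="warm"):
    """Build a settings update that requires the index to live on matching nodes.

    `attribute` must match a custom node attribute configured on the nodes
    (assumed here to be `box_type`).
    """
    body = {f"index.routing.allocation.require.{attribute}": value}
    return ("PUT", f"/{index}/_settings", json.dumps(body))


# Hypothetical hourly index that is no longer written to.
warm_request = move_to_warm_request("messages-2023.02.04.21")
print(*warm_request)
```

Unlike closing, this keeps the old index searchable the whole time; only the placement of its shards changes.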

Val answered Feb 15 '23 22:02


The use case for closed indices is quite niche. Besides changing some settings, as in Val's answer, I don't see a wide (and correct) use for them. You might have problems when scaling the cluster (as shards of closed indices aren't moved around), or, when you open N closed indices at once, you might end up putting too much pressure on the cluster.

In reality, the plumbing needed to effectively open and close indices on demand isn't justified, or at least I haven't seen a case where it is. That's why frozen indices are deprecated now.

You might consider alternatives, like force-merge, snapshot/restore or, depending on your Elasticsearch license, a searchable snapshot.
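To illustrate the force-merge option for hourly indices, here is a small sketch that derives the name of the index for the previous hour and builds the corresponding `_forcemerge` request without sending it. The naming pattern `messages-YYYY.MM.DD.HH` is an assumption made for the example; the question does not say how the indices are named.

```python
from datetime import datetime, timedelta


def previous_hour_index(now, prefix="messages-"):
    """Name of the index for the hour before `now` (hypothetical naming scheme)."""
    return prefix + (now - timedelta(hours=1)).strftime("%Y.%m.%d.%H")


def force_merge_request(index, max_num_segments=1):
    """Build a force-merge request; run it only on indices no longer written to."""
    if max_num_segments < 1:
        raise ValueError("max_num_segments must be at least 1")
    return ("POST", f"/{index}/_forcemerge?max_num_segments={max_num_segments}", None)


old_index = previous_hour_index(datetime(2023, 2, 4, 22, 5))
merge_request = force_merge_request(old_index)
print(merge_request[0], merge_request[1])
```

Force-merging is I/O heavy, so with 100K messages per second it is probably worth scheduling it so that it does not compete with indexing into the current index.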

If you're using time-series data, you might also consider a hosted solution, like Sematext Logs (bias alert: I work for Sematext).

Radu Gheorghe answered Feb 15 '23 23:02