
How to check if ElasticSearch index exists and is ready?

How can I check if an index exists AND is ready for use in ElasticSearch?

We currently check whether "indexA" exists by running a query that selects some documents. If the query doesn't return any hits, we assume that "indexA" doesn't exist and create the index (fresh install). The problem is that our application starts faster than ElasticSearch when the server reboots, and we end up with two duplicate "indexA" indices because the search for documents in "indexA" fails while ElasticSearch is starting up. (I guess the index is not ready yet.)

There is a method: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-exists.html - is this guaranteed to return "true" for "indexA", even when ES is starting up and the index is not ready yet?

Or should I use the "status"-method, specify indexname, and check if all shards have status "STARTED"?

Or should I use this: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-recovery.html Can ElasticSearch go into "recovery-mode"? When does this happen and how should we handle it?

Or should I look into "CatHealth"? .Epoc?

Thomas asked Mar 15 '23 23:03


1 Answer

When you say ready, do you just mean ready to start searching?

Reading into your question, it sounds like you'd like to know the status of the cluster, which you can get with the Cluster Health API:

curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'

You can pass a query string parameter, wait_for_status=green, which makes the call wait until the cluster reaches the given status (or until the timeout expires, 30 seconds by default).
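
As a rough illustration of how an application could use this at start-up, here is a minimal Python sketch using only the standard library. It assumes the health call is scoped to a single index via /_cluster/health/<index>, and that a timed-out wait may come back as a non-2xx response that still carries the health JSON. The function names fetch_index_health and is_ready, and the opener parameter, are made up for this example and are not part of any Elasticsearch client.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# Health colours ordered from worst to best.
_RANK = {"red": 0, "yellow": 1, "green": 2}


def fetch_index_health(base_url, index, wait_for="yellow", timeout="30s",
                       opener=urllib.request.urlopen):
    """Return the health JSON for one index, or None if no usable answer came back."""
    url = "{}/_cluster/health/{}?wait_for_status={}&timeout={}".format(
        base_url.rstrip("/"), urllib.parse.quote(index, safe=""), wait_for, timeout)
    try:
        with opener(url) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        # Assumption: a wait that times out can arrive as an HTTP error whose
        # body is still the health JSON, so try to parse it.
        try:
            return json.load(err)
        except ValueError:
            return None
    except OSError:
        # Connection refused or reset: Elasticsearch is probably still starting.
        return None


def is_ready(health, minimum="yellow"):
    """True if the wait did not time out and the status is at least `minimum`."""
    if not health or health.get("timed_out"):
        return False
    return _RANK.get(health.get("status"), -1) >= _RANK[minimum]
```

A start-up routine could call is_ready(fetch_index_health("http://localhost:9200", "indexA")) in a retry loop. It should treat the index as missing only after Elasticsearch has actually answered, never because a request failed.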

Based on your comments in the question, the cluster is in yellow status because there are 5 unassigned shards. When running with the default configuration (in versions before 7.0), Elasticsearch creates 5 primary shards and 1 replica (i.e. a replica shard for each primary shard). Since there is only one node in the cluster, the replica shards will remain unassigned: Elasticsearch will not place a replica on the same node as its primary, because that would not provide any redundancy.

Adding another node lets Elasticsearch assign the replicas, placing each one on a different node from its primary. The balancer may, for example, end up with 2 primaries and 3 replicas on the new node, and 3 primaries and 2 replicas on the original node. With this distribution, a node can go down but no data will be lost. Adding another node will change the status to green, although you can already use the cluster in yellow status.
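
The shard arithmetic above can be sketched as a small Python function. This is a deliberately simplified model, not how Elasticsearch computes health internally. It applies only the rule that two copies of the same shard never share a node, and ignores disk watermarks, allocation filters and similar settings. The name expected_health is hypothetical.

```python
def expected_health(primaries, replicas, nodes):
    """Return (status, unassigned_shards) for a simplified allocation model.

    primaries: number of primary shards in the index
    replicas:  replica copies configured per primary
    nodes:     data nodes available in the cluster
    """
    if nodes < 1:
        # No node at all: nothing can be assigned, not even primaries.
        return "red", primaries * (1 + replicas)
    # Each replica needs a node that holds no other copy of the same shard,
    # so at most (nodes - 1) replicas per primary can be placed.
    placeable = min(replicas, nodes - 1)
    unassigned = primaries * (replicas - placeable)
    return ("green" if unassigned == 0 else "yellow"), unassigned


print(expected_health(5, 1, 1))  # single node: the 5 replicas stay unassigned
print(expected_health(5, 1, 2))  # second node: every replica can be placed
```

With 5 primaries, 1 replica each and a single node, the model gives yellow with 5 unassigned shards, matching the situation described. With two nodes it gives green.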

If you're going to be using this cluster in production, I strongly recommend having at least 2 nodes (ideally on separate machines) so you have at least one replica.

Russ Cam answered Mar 27 '23 05:03