
Getting an ElasticSearch Cluster to Green (Cluster Setup on OS X)

I have installed ElasticSearch on Mac OS X using Homebrew. It works. The cluster started off with "green" health. However, right after adding data, it has gone to "yellow".

The cluster health status is: green, yellow or red. On the shard level, a red status indicates that the specific shard is not allocated in the cluster, yellow means that the primary shard is allocated but replicas are not, and green means that all shards are allocated. The index level status is controlled by the worst shard status. The cluster status is controlled by the worst index status.
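The roll-up rule quoted above can be sketched in a few lines of Python. This is only an illustration of the "worst status wins" logic, not ElasticSearch's own code; the function names and the input shape (a dict mapping an index name to a list of per-shard statuses) are made up for the example.

```python
# Severity order: a higher number is worse.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def worst(statuses):
    """Return the worst status in an iterable; nothing to report counts as green."""
    return max(statuses, key=SEVERITY.__getitem__, default="green")


def index_status(shard_statuses):
    """An index is as healthy as its least healthy shard."""
    return worst(shard_statuses)


def cluster_status(indices):
    """A cluster is as healthy as its least healthy index."""
    return worst(index_status(shards) for shards in indices.values())


# Hypothetical data: one fully allocated index, one with a missing replica.
example = {
    "logs": ["green", "green"],
    "users": ["green", "yellow"],
}
print(cluster_status(example))
```

Note that an empty cluster comes out green under this sketch, which matches the health output of a fresh node with zero shards.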

So, my replica shards are not allocated. How do I allocate them? (I'm thinking out loud.)

According to Shay on "I keep getting cluster health status of Yellow": "the shard allocation mechanism does not allocate a shard and its replica on the same node, though it does allocate different shards on the same node. So, you will need two nodes to get cluster state of green."
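To make that rule concrete for myself, here is a rough Python sketch of its consequence. It assumes the only constraint is "no two copies of the same shard on one node" and ignores every other allocation setting; the function names and the 5-shard, 1-replica figures are illustrative values, not something read from my cluster.

```python
def unassigned_replicas(number_of_shards, number_of_replicas, data_nodes):
    """Count replica copies that cannot be placed.

    Assumes data_nodes >= 1 and that the only constraint is that a node
    holds at most one copy (primary or replica) of any given shard.
    """
    # Per shard: the primary takes one node, leaving data_nodes - 1 nodes
    # that can each hold one replica.
    placeable = min(number_of_replicas, data_nodes - 1)
    return number_of_shards * (number_of_replicas - placeable)


def expected_status(number_of_shards, number_of_replicas, data_nodes):
    """Green only if every replica has a node of its own to live on."""
    missing = unassigned_replicas(number_of_shards, number_of_replicas, data_nodes)
    return "green" if missing == 0 else "yellow"


# Illustrative index with 5 shards and 1 replica each, on 1 and then 2 nodes.
for nodes in (1, 2):
    print(nodes, expected_status(5, 1, nodes))
```

In other words, under this simplified model an index needs at least number_of_replicas + 1 data nodes before it can go green.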

So, I need to start up a second node. I did this by:

cd ~/Library/LaunchAgents/
cp homebrew.mxcl.elasticsearch.plist homebrew.mxcl.elasticsearch-2.plist
# change line 8 to: homebrew.mxcl.elasticsearch-2
launchctl load -wF ~/Library/LaunchAgents/homebrew.mxcl.elasticsearch-2.plist

Now I have "Korvus" on http://localhost:9200/ and "Iron Monger" on http://localhost:9201/. Woot. But, I don't see any indications that they know about each other. How do I connect / introduce them to each other?

Note: I read Zen Discovery, but do not feel enlightened yet.

Update 2012-08-13 11:30 PM EST:

Here are my two nodes:

curl "http://localhost:9200/_cluster/health?pretty=true"
{
  "cluster_name" : "elasticsearch_david",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}

curl "http://localhost:9201/_cluster/health?pretty=true"
{
  "cluster_name" : "elasticsearch_david",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}

Update 2012-08-13 11:35 PM EST:

To clarify, my question is not how to "ignore" the problem by setting index.number_of_replicas: 0. I want multiple nodes, connected.

Update 2012-08-13 11:48 PM EST:

I just posted a double gist that includes elasticsearch.yml and elasticsearch_david.log. It looks to me like both nodes are calling themselves 'master'. Is that what I should expect?

Update 2012-08-14 12:36 AM EST:

And the novel continues! :) If I disconnect my Mac from all external networks, and then restart the nodes, then they find each other. Double Woot. This makes me think that the problem is with my network/multicast configuration. Currently I have this in my config: network.host: 127.0.0.1. Perhaps this is not correct?

asked Aug 14 '12 02:08 by David J.



2 Answers

Solved. Do not use network.host: 127.0.0.1. Leave that line commented out to let it be automatically derived.

The default elasticsearch.yml was correct. A configuration tweak by the Homebrew installer specifies the 127.0.0.1 loopback interface:

# Set up ElasticSearch for local development:
inreplace "#{prefix}/config/elasticsearch.yml" do |s|
  # ...
  # 3. Bind to loopback IP for laptops roaming different networks
  s.gsub! /#\s*network\.host\: [^\n]+/, "network.host: 127.0.0.1"
end

I filed an issue on the Homebrew issue tracker.
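If you want to check your own config for the same tweak, something like the following Python sketch could do it. It is a plain line scan rather than a real YAML parse, and the function name and the set of values treated as "loopback" are my own assumptions.

```python
import re

# Values treated as loopback for this check (an assumption, not an exhaustive list).
LOOPBACK = {"127.0.0.1", "localhost", "::1"}


def loopback_network_host(yml_text):
    """Return the loopback value of an active network.host line, or None.

    Lines starting with '#' are comments and are ignored, so the commented-out
    default in elasticsearch.yml does not count as a match.
    """
    for line in yml_text.splitlines():
        match = re.match(r"\s*network\.host\s*:\s*(\S+)", line)
        if match:
            value = match.group(1).strip("\"'")
            if value in LOOPBACK:
                return value
    return None


sample = "cluster.name: elasticsearch_david\nnetwork.host: 127.0.0.1\n"
print(loopback_network_host(sample))
```

To run it against a real file, read the file's text and pass it in; the path depends on where Homebrew put your config.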

answered Oct 12 '22 04:10 by David J.


As you correctly noticed, the cluster became yellow because you created an index with replicas but had only one node in the cluster. One way to solve this problem is to allocate the replicas on a second node. Another way is to turn replicas off. The number of replicas can be specified during index creation. The following command creates a new index named new-index-name with 1 shard and no replicas:

curl -XPUT 'localhost:9200/new-index-name' -d '
{
    "settings": {
        "index" : {
            "number_of_shards" : 1,
            "number_of_replicas" : 0
        }
    }
}
'

It's also possible to change the number of replicas after the index has been created, using the Indices Update Settings API. The following command changes the number of replicas to 0 for all indices in your cluster:

curl -XPUT 'localhost:9200/_settings' -d '
{
    "index" : {
        "number_of_replicas" : 0
    }
}
'
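For scripting, the same update can be built in Python with only the standard library. This is a sketch under a few assumptions: the helper name is made up, the optional per-index path /{index}/_settings is used when an index name is given, and the Content-Type header is there because, as far as I know, newer ElasticSearch releases reject request bodies without it (the curl commands above predate that). The request is only constructed here, not sent.

```python
import json
import urllib.request


def replicas_request(base_url, replicas, index=None):
    """Build (but do not send) a PUT request that sets number_of_replicas."""
    path = f"/{index}/_settings" if index else "/_settings"
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = replicas_request("http://localhost:9200", 0)
print(req.get_method(), req.full_url)
# To apply it against a running node: urllib.request.urlopen(req)
```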

You can verify that the nodes found each other by running the cluster health command:

$ curl "http://localhost:9200/_cluster/health?pretty=true"
{
  "cluster_name" : "elasticsearch",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 30,
  "active_shards" : 55,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
} 

Here the line "number_of_data_nodes" : 2 indicates that my cluster consists of two nodes, which means they found each other. You can also run the Nodes Info command to see which nodes your cluster consists of:

curl "http://localhost:9200/_cluster/nodes?pretty=true"
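If you would rather check this from a script than by eye, a small Python sketch along these lines could work. It only parses the JSON text that the cluster health command returns; the function name and the returned keys are my own choices, and the sample below is a trimmed copy of the single-node output from the question.

```python
import json


def summarize_health(health_json, expected_nodes):
    """Pull out the fields that tell you whether the nodes have joined."""
    health = json.loads(health_json)
    return {
        "joined": health["number_of_nodes"] >= expected_nodes,
        "status": health["status"],
        "unassigned_shards": health["unassigned_shards"],
    }


# Trimmed health output from a node that has not found its peer.
sample = """
{
  "cluster_name" : "elasticsearch_david",
  "status" : "green",
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "unassigned_shards" : 0
}
"""
print(summarize_health(sample, expected_nodes=2))
```

A "green" status alone is not enough here: an empty one-node cluster is also green, so the node count is the field to look at.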
answered Oct 12 '22 04:10 by imotov