
Primary shard is not active or isn't assigned is a known node?

I am running Elasticsearch version 4.1 on Windows 8. I tried to index a document through Java. When I run a JUnit test, the error below appears.

org.elasticsearch.action.UnavailableShardsException: [wms][3] Primary shard is not active or isn't assigned is a known node. Timeout: [1m], request: index {[wms][video][AUpdb-bMQ3rfSDgdctGY], source[{
    "fleetNumber": "45",
    "timestamp": "1245657888",
    "geoTag": "73.0012312,-123.00909",
    "videoName": "timestamp.mjpeg",
    "content": "ASD123124NMMM"
}]}
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.retryBecauseUnavailable(TransportShardReplicationOperationAction.java:784)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.doStart(TransportShardReplicationOperationAction.java:402)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$3.onTimeout(TransportShardReplicationOperationAction.java:500)
    at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:239)
    at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:497)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)

I cannot figure out what causes this error. When I delete data or an index, it works fine. What might be the cause?

Prem Singh Bist asked Dec 18 '14 12:12


2 Answers

You should look at this link: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-allocation.html

and this part in particular:

cluster.routing.allocation.disk.watermark.low controls the low watermark for disk usage. It defaults to 85%, meaning ES will not allocate new shards to nodes once they have more than 85% disk used. It can also be set to an absolute byte value (like 500mb) to prevent ES from allocating shards if less than the configured amount of space is available.

cluster.routing.allocation.disk.watermark.high controls the high watermark. It defaults to 90%, meaning ES will attempt to relocate shards to another node if the node disk usage rises above 90%. It can also be set to an absolute byte value (similar to the low watermark) to relocate shards once less than the configured amount of space is available on the node.

Alexandre Mélard answered Oct 05 '22 23:10


The Problem: it seems that Elasticsearch stops sending data to Kibana once the disk usage limit is exceeded. You get org.elasticsearch.action.UnavailableShardsException and a timeout because your primary shard is not active. To test this theory, run sudo df -h; you will probably see high usage percentages on the data volumes (such as /var/data) on your machine.
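The disk check described above can be scripted. What follows is only a minimal sketch, assuming the POSIX `df -P` column layout and mount points without spaces; the helper name `mounts_over_threshold` and its default of 85 (chosen to match the default low watermark) are illustrative and not part of Elasticsearch.

```shell
# List mount points whose disk usage is at or above a threshold percentage.
# Reads `df -P`-style output on stdin, so it works on live or sample data.
# mounts_over_threshold is a hypothetical helper name.
mounts_over_threshold() {
  threshold="${1:-85}"
  awk -v t="$threshold" '
    NR > 1 {                 # skip the header line
      pct = $5               # fifth column is the usage, e.g. "91%"
      sub(/%/, "", pct)      # strip the percent sign before comparing
      if (pct + 0 >= t) print $6 " " $5
    }'
}

# Example usage against the live system:
#   df -P | mounts_over_threshold 85
```

Any mount point printed by this function that also holds the Elasticsearch data directory is a likely reason for shards staying unassigned.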

Explanation: according to the Elasticsearch documentation on disk-based shard allocation, Elasticsearch considers the available disk space on a node before deciding whether to allocate new shards to that node or to actively relocate shards away from that node. Four settings control this behavior, and you can change them to override the defaults:

1. cluster.routing.allocation.disk.threshold_enabled Defaults to true. Set to false to disable the disk allocation decider.

2. cluster.routing.allocation.disk.watermark.low Controls the low watermark for disk usage. It defaults to 85%, meaning that Elasticsearch will not allocate shards to nodes that have more than 85% disk used. It can also be set to an absolute byte value (like 500mb) to prevent Elasticsearch from allocating shards if less than the specified amount of space is available. This setting has no effect on the primary shards of newly-created indices but will prevent their replicas from being allocated.

3. cluster.routing.allocation.disk.watermark.high Controls the high watermark. It defaults to 90%, meaning that Elasticsearch will attempt to relocate shards away from a node whose disk usage is above 90%. It can also be set to an absolute byte value (similarly to the low watermark) to relocate shards away from a node if it has less than the specified amount of free space. This setting affects the allocation of all shards, whether previously allocated or not.

4. cluster.routing.allocation.disk.watermark.flood_stage Controls the flood stage watermark. It defaults to 95%, meaning that Elasticsearch enforces a read-only index block (index.blocks.read_only_allow_delete) on every index that has one or more shards allocated on the node that has at least one disk exceeding the flood stage. This is a last resort to prevent nodes from running out of disk space. The index block is automatically released once the disk utilization falls below the high watermark.
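To make the three thresholds concrete, here is a rough sketch that maps a disk-usage percentage to the watermark stage it falls into. It only mirrors the percentage rules quoted above (usage strictly above each threshold) with the default values 85, 90 and 95; the function name `watermark_stage` is hypothetical, and absolute byte values are not handled.

```shell
# Map an integer disk-usage percentage to the watermark stage it exceeds.
# Optional arguments 2-4 override the default low/high/flood thresholds.
# watermark_stage is a hypothetical helper, not an Elasticsearch command.
watermark_stage() {
  used="$1"
  low="${2:-85}"
  high="${3:-90}"
  flood="${4:-95}"
  if [ "$used" -gt "$flood" ]; then
    echo "flood_stage"   # read-only block applied to affected indices
  elif [ "$used" -gt "$high" ]; then
    echo "high"          # Elasticsearch tries to relocate shards away
  elif [ "$used" -gt "$low" ]; then
    echo "low"           # no new shards are allocated to this node
  else
    echo "ok"
  fi
}
```

For example, `watermark_stage 92` prints `high`, while `watermark_stage 92 95 97 98` prints `ok` because the raised thresholds are no longer exceeded.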

Solution: make an API call that updates the cluster settings and raises the disk watermarks above their defaults (85%, 90% and 95%) to 95%, 97% and 98%:

curl -XPUT -H 'Content-Type: application/json' 'localhost:9200/_cluster/settings' \
  -d '{
    "transient": {
      "cluster.routing.allocation.disk.watermark.low": "95%",
      "cluster.routing.allocation.disk.watermark.high": "97%",
      "cluster.routing.allocation.disk.watermark.flood_stage": "98%",
      "cluster.info.update.interval": "1m"
    }
  }'
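After changing the watermarks, it can help to confirm what the cluster is using and to revert the override once disk space has been freed. The sketch below is one possible way to do that, not a definitive procedure: the `ES_URL` variable and all three function names are assumptions made for illustration, and resetting a setting by sending null assumes a reasonably recent Elasticsearch version.

```shell
# Base URL of the node; an assumption, adjust to your environment.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print a transient-settings body that resets the three watermarks to
# their defaults by setting them to null (hypothetical helper name).
reset_watermarks_body() {
  cat <<'EOF'
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": null,
    "cluster.routing.allocation.disk.watermark.high": null,
    "cluster.routing.allocation.disk.watermark.flood_stage": null
  }
}
EOF
}

# Show the settings currently in effect and the overall cluster health.
show_cluster_state() {
  curl -s "$ES_URL/_cluster/settings?pretty"
  curl -s "$ES_URL/_cluster/health?pretty"
}

# Send the reset body to the cluster settings API (reads the body on stdin).
reset_watermarks() {
  reset_watermarks_body | curl -s -XPUT -H 'Content-Type: application/json' \
    "$ES_URL/_cluster/settings" -d @-
}
```

Nothing runs until a function is called: `show_cluster_state` only reads, while `reset_watermarks` modifies the cluster, so call it only after the disk usage is back under the default thresholds.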
avivamg answered Oct 05 '22 23:10