YARN UNHEALTHY nodes

In our YARN cluster, which is 80% full, some of the YARN NodeManagers are being marked as UNHEALTHY. After digging into the logs, I found that this happens because disk space on the data directories is 90% full. The ResourceManager log shows the following:

2015-02-21 08:33:51,590 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node hdp009.abc.com:8041 reported UNHEALTHY with details: 4/4 local-dirs are bad: /data3/yarn/nm,/data2/yarn/nm,/data4/yarn/nm,/data1/yarn/nm;
2015-02-21 08:33:51,590 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: hdp009.abc.com:8041 Node Transitioned from RUNNING to UNHEALTHY

I am trying to understand how YARN decides to mark a node as UNHEALTHY, and whether there is any way to change the threshold.

Thanks

asked Mar 12 '15 12:03

roy

1 Answer

Try adding the property yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage to yarn-site.xml. This property specifies the maximum percentage of disk space utilization allowed; once a disk exceeds it, the disk is marked as bad. Values can range from 0.0 to 100.0, and the default is 90.0, which matches the 90% usage you are seeing.

The defaults are listed in yarn-default.xml.

To force the node into a healthy state regardless of disk usage, for example:

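<?xml version="1.0"?>
<configuration>
  <!-- Minimum fraction of disks that must be healthy; 0.0 means no disk has to be. -->
  <property>
     <name>yarn.nodemanager.disk-health-checker.min-healthy-disks</name>
     <value>0.0</value>
  </property>
  <!-- Maximum disk utilization (percent) before a disk is marked as bad. -->
  <property>
     <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
     <value>100.0</value>
  </property>
</configuration>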
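
To see how these two properties interact, here is a rough Python model of the decision. It is only a sketch of the idea, not the NodeManager's actual code: the class and function names are made up for illustration, the per-directory usage figures are hypothetical (the log only tells us that all four directories were bad), and other checks such as minimum free space or directory permissions are left out.

```python
from dataclasses import dataclass


@dataclass
class DiskCheckConfig:
    # Hypothetical container for the two yarn-site.xml properties above.
    # The default values mirror those listed in yarn-default.xml.
    max_disk_utilization_pct: float = 90.0
    min_healthy_disks: float = 0.25


def is_dir_bad(used_pct: float, cfg: DiskCheckConfig) -> bool:
    """A directory counts as bad once its disk usage exceeds the cutoff."""
    return used_pct > cfg.max_disk_utilization_pct


def is_node_healthy(used_pct_by_dir: dict, cfg: DiskCheckConfig) -> bool:
    """The node stays healthy while enough of its directories are good."""
    total = len(used_pct_by_dir)
    if total == 0:
        return False
    good = sum(1 for pct in used_pct_by_dir.values() if not is_dir_bad(pct, cfg))
    return good / total >= cfg.min_healthy_disks


# Assumed usage figures for the four local-dirs named in the log.
dirs = {
    "/data1/yarn/nm": 91.0,
    "/data2/yarn/nm": 92.0,
    "/data3/yarn/nm": 91.5,
    "/data4/yarn/nm": 93.0,
}

print(is_node_healthy(dirs, DiskCheckConfig()))                               # defaults
print(is_node_healthy(dirs, DiskCheckConfig(max_disk_utilization_pct=95.0)))  # raised cutoff
print(is_node_healthy(dirs, DiskCheckConfig(min_healthy_disks=0.0)))          # no healthy disk required
```

In this model, raising the utilization cutoff or lowering min-healthy-disks both bring the node back to healthy. Raising the cutoff moderately is probably the safer of the two, since it still flags disks that are genuinely full.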

answered Nov 01 '22 05:11

Hamza Zafar

Tags: distributed-computing, hadoop, hadoop-yarn, cloudera, cloudera-cdh
