How to keep YARN's log files?

Tags: hadoop, hadoop-yarn

Suddenly, my YARN cluster has stopped working; everything I submit fails with "Exit code 1". I want to track down the problem, but as soon as an application fails, YARN deletes the log files. Which configuration setting do I have to adjust for YARN to keep these log files?

600

asked Sep 22 '15 09:09

rabejens

1 Answer

It seems your container is exiting with exit code 1.

You cannot see the logs in the UI because log aggregation is disabled by default. It is controlled by the parameter "yarn.log-aggregation-enable", which defaults to "false".

While it is set to "false", each Node Manager stores its container logs in a local directory, determined by the configuration parameter "yarn.nodemanager.log-dirs".

For example, in my case, this is set to:

  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>e:\hdpdata\hadoop\logs</value>
  </property>

So the container logs for a particular application are found in the folder "e:\hdpdata\hadoop\logs\{application-id}\{container-id}" on the Node Manager machine where each container ran; the Application Master's logs are on the machine where the Application Master ran.

Let's assume that my application "application_1443377528298_0010" FAILED. In the YARN Resource Manager's web UI (its address is determined by the config parameter "yarn.resourcemanager.webapp.address"), you can find the node on which the Application Master ran. In my case, the Application Master ran on the machine "120243".

If you log in to this machine and look in the folder "e:\hdpdata\hadoop\logs\application_1443377528298_0010\", you can see the logs of all the containers of application "application_1443377528298_0010" that ran on that machine.
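If the local logs disappear before you can inspect them, the Node Manager's retention settings are one place to look. The following is a minimal sketch for yarn-site.xml and is not part of my own setup: "yarn.nodemanager.log.retain-seconds" (only applicable while log aggregation is disabled) and "yarn.nodemanager.delete.debug-delay-sec" are standard YARN parameters, but the values shown are assumptions chosen for illustration and should be adjusted for your cluster.

  <!-- Illustrative values only (assumptions); tune them for your cluster. -->
  <property>
    <!-- Seconds to keep local container logs; used only when log aggregation is disabled -->
    <name>yarn.nodemanager.log.retain-seconds</name>
    <value>86400</value>
  </property>
  <property>
    <!-- Seconds to wait after an application finishes before its local files and log directory are deleted -->
    <name>yarn.nodemanager.delete.debug-delay-sec</name>
    <value>3600</value>
  </property>

A non-zero debug delay keeps finished applications' directories on disk for longer, so it is usually set back to its default once the problem has been found. The Node Managers typically need a restart to pick up changes in yarn-site.xml.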

If you want to see the logs through the YARN Resource Manager web UI instead, you need to enable log aggregation. To do that, set the following parameters in yarn-site.xml:

  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/app-logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
    <value>logs</value>
  </property>

With the above settings, my logs are aggregated in HDFS at "/app-logs/{username}/logs/". Under this folder, you can find the logs of all the applications run so far. How long the aggregated logs are kept is determined by the configuration parameter "yarn.log-aggregation.retain-seconds".
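As a sketch of that retention setting, the fragment below would keep aggregated logs for seven days. The value 604800 is an assumption chosen for illustration and is not taken from my setup; the default of -1 disables the deletion of aggregated logs.

  <property>
    <!-- 7 days * 86400 seconds per day = 604800 seconds (illustrative value) -->
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
  </property>

A very small value makes YARN check and delete logs in HDFS more often, so it is better to choose a retention period that is comfortably longer than the time you need for debugging.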

While a MapReduce application is running, you can access its logs from YARN's web UI. Once the application has completed, the logs are served through the Job History Server.

In your case, if you want to see the logs in the web UI after the application has terminated, you also need to run the MapReduce Job History Server. To configure it, set the following parameters in mapred-site.xml:

  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>{job-history-hostname}:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>{job-history-hostname}:19888</value>
  </property>

And set the following configuration parameter in yarn-site.xml:

  <property>
    <name>yarn.log.server.url</name>
    <value>http://{job-history-hostname}:19888/jobhistory/logs</value>
  </property>

I replicated these settings from an HDP installation on Windows, and they work for me; they should work for you as well. For a description of each of the configuration parameters mentioned above, refer to the links below:

https://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml

https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

155

answered Sep 28 '22 04:09

Manjunath Ballur
