Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get a yarn configuration from commandline

In EMR, is there a way to get a specific value of the configuration given the configuration key using the yarn command?

For example I would like to do something like this

yarn get-config yarn.scheduler.maximum-allocation-mb
like image 859
fo_x86 Avatar asked Jan 07 '16 22:01

fo_x86


People also ask

How do you find the yarn configuration?

The yarn config command helps you to manage yarn configuration files. You can set a file, read it, update it or even delete it. The config key will be set to a certain value when you run this command.

What is yarn command-line?

yarn add: the yarn add command is a command you run in your terminal when you want to add a package to your current package (project) yarn init: we used this command in our tutorial on getting started, this command is to be run in your terminal. It will initialize the development of a package.

What is yarn dev command?

yarn install is used to install all dependencies for a project. This is most commonly used when you have just checked out code for a project, or when another developer on the project has added a new dependency that you need to pick up. If you are used to using npm you might be expecting to use --save or --save-dev .


1 Answers

It's a bit non-intuitive, but it turns out the hdfs getconf command is capable of checking configuration properties for YARN and MapReduce, not only HDFS.

> hdfs getconf -confKey fs.defaultFS
hdfs://localhost:19000

> hdfs getconf -confKey dfs.namenode.name.dir
file:///Users/chris/hadoop-deploy-trunk/data/dfs/name

> hdfs getconf -confKey yarn.resourcemanager.address
0.0.0.0:8032

> hdfs getconf -confKey mapreduce.framework.name
yarn

A benefit of using this is that you'll see the actual, final results of any configuration properties as they are actually used by Hadoop. This would account for some of the more advanced configuration patterns, such as use of XInclude in the XML files or property substitutions, like this:

  <property>
    <description>The address of the applications manager interface in the RM.</description>
    <name>yarn.resourcemanager.address</name>
    <value>${yarn.resourcemanager.hostname}:8032</value>
  </property>

Any scripting approach that tries to parse the XML files directly is unlikely to accurately match the implementation as its done inside Hadoop, so it's better to ask Hadoop itself.

You might be wondering why an hdfs command can get configuration properties for YARN and MapReduce. Great question! It's somewhat of a coincidence of the implementation needing to inject an instance of MapReduce's JobConf into some objects created via reflection. The relevant code is visible here:

https://github.com/apache/hadoop/blob/release-2.7.1/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ReflectionUtils.java#L82-L114

This code is executed as part of running the hdfs getconf command. By triggering a reference to JobConf, it forces class loading and static initialization of the relevant MapReduce and YARN classes that add yarn-default.xml, yarn-site.xml, mapred-default.xml and mapred-site.xml to the set of configuration files in effect.

Since it's a coincidence of the implementation, it's possible that some of this behavior will change in future versions, but it would be a backwards-incompatible change, so we definitely wouldn't change that behavior inside the current Hadoop 2.x line. The Apache Hadoop Compatibility policy commits to backwards-compatibility within a major version line, so you can trust that this will continue working at least within the 2.x version line.

like image 193
Chris Nauroth Avatar answered Sep 28 '22 09:09

Chris Nauroth