Hadoop Yarn: How to limit dynamic self allocation of resources with Spark?

In our Hadoop cluster, which runs under YARN, some "smarter" users are able to grab significantly larger chunks of resources by configuring their Spark jobs in PySpark Jupyter notebooks like this:

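from pyspark import SparkConf, SparkContext

# The user picks the executor count and memory size themselves
conf = (SparkConf()
        .setAppName("name")
        .setMaster("yarn-client")
        .set("spark.executor.instances", "1000")
        .set("spark.executor.memory", "64g")
        )

sc = SparkContext(conf=conf)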

As a result, these users literally squeeze out the less "smart" ones.

Is there a way to forbid users from self-allocating resources and leave resource allocation solely to YARN?

asked Oct 12 '16 12:10

Sergey Bushmanov

1 Answer

YARN has very good support for capacity planning in a multi-tenant cluster through queues. The YARN ResourceManager uses the CapacityScheduler by default.

For demonstration purposes, the spark-submit command below uses a queue named alpha.

$ ./bin/spark-submit --class path/to/class/file \
    --master yarn-cluster \
    --queue alpha \
    jar/location \
    args
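For the notebook case in the question, the same queue can be requested through the Spark property spark.yarn.queue instead of the --queue flag. The following is only a minimal sketch, assuming the queue is called alpha as above. The helper name yarn_queue_settings is made up for illustration, and the PySpark calls are left as comments because they need a live cluster:

```python
def yarn_queue_settings(app_name, queue):
    """Return (key, value) pairs that direct a Spark application to a YARN queue."""
    return [
        ("spark.app.name", app_name),  # same effect as setAppName(...)
        ("spark.yarn.queue", queue),   # same effect as --queue on spark-submit
    ]


settings = yarn_queue_settings("name", "alpha")
for key, value in settings:
    print(key, "=", value)

# With PySpark installed and a cluster available, the pairs could be applied as:
#   from pyspark import SparkConf, SparkContext
#   conf = SparkConf().setMaster("yarn-client").setAll(settings)
#   sc = SparkContext(conf=conf)
```

Whatever executor count and memory the notebook asks for, the application is then scheduled within the limits configured for that queue.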

Set up the queues:

CapacityScheduler has a predefined queue called root. All queues in the system are children of the root queue. In capacity-scheduler.xml, the parameter yarn.scheduler.capacity.root.queues defines the child queues. For example, to create 3 queues, specify their names in a comma-separated list.

<property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>alpha,beta,default</value>
    <description>The queues at this level (root is the root queue).</description>
</property>
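The queue names declared here become part of every later property name (for example, root.alpha in yarn.scheduler.capacity.root.alpha.capacity). As a rough illustration of that mapping, the sketch below parses the property with Python's standard library and derives the full queue paths. The function name child_queue_paths is hypothetical, and the XML is a copy of the snippet above without its description:

```python
import xml.etree.ElementTree as ET

PROPERTY_XML = """
<property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>alpha,beta,default</value>
</property>
"""


def child_queue_paths(property_xml):
    """Return the full queue paths declared by a '<parent>.queues' property."""
    prop = ET.fromstring(property_xml)
    name = prop.findtext("name").strip()
    value = prop.findtext("value").strip()
    prefix = "yarn.scheduler.capacity."
    # "yarn.scheduler.capacity.root.queues" -> parent path "root"
    parent = name[len(prefix):-len(".queues")]
    return [f"{parent}.{child.strip()}" for child in value.split(",")]


paths = child_queue_paths(PROPERTY_XML)
print(paths)
```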

These are a few important properties to consider for capacity planning.

<property>
    <name>yarn.scheduler.capacity.root.alpha.capacity</name>
    <value>50</value>
    <description>Queue capacity in percentage (%) as a float (e.g. 12.5). The sum of capacities for all queues, at each level, must be equal to 100. Applications in the queue may consume more resources than the queue’s capacity if there are free resources, providing elasticity.</description>
</property>
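The rule that sibling capacities must sum to 100 can be checked before a configuration is deployed. A small sketch, assuming alpha keeps its 50% from above while the values for beta and default are invented for the example:

```python
import math


def capacities_are_valid(capacities):
    """Check that the capacities of sibling queues add up to 100 percent."""
    return math.isclose(sum(capacities.values()), 100.0, abs_tol=1e-6)


# alpha = 50 comes from the property above; beta and default are assumed values.
root_children = {"alpha": 50.0, "beta": 30.0, "default": 20.0}
print(capacities_are_valid(root_children))
```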

<property>
    <name>yarn.scheduler.capacity.root.alpha.maximum-capacity</name>
    <value>80</value>
    <description>Maximum queue capacity in percentage (%) as a float. This limits the elasticity for applications in the queue. Defaults to -1 which disables it.</description>
</property>
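To see what capacity and maximum-capacity mean in absolute terms, the percentages can be applied to the cluster size. The sketch below assumes a hypothetical cluster with 10,000 GB of memory managed by YARN and compares the queue's bounds with the request from the question (1000 executors of 64g each, ignoring per-executor memory overhead):

```python
def queue_memory_bounds_gb(cluster_memory_gb, capacity_pct, maximum_capacity_pct):
    """Return (guaranteed, ceiling) memory in GB for a queue directly under root."""
    guaranteed = cluster_memory_gb * capacity_pct / 100.0
    ceiling = cluster_memory_gb * maximum_capacity_pct / 100.0
    return guaranteed, ceiling


CLUSTER_MEMORY_GB = 10_000  # hypothetical cluster size, not from the question

guaranteed, ceiling = queue_memory_bounds_gb(CLUSTER_MEMORY_GB, 50, 80)
requested = 1000 * 64  # executors * GB per executor, as in the question

print("guaranteed:", guaranteed, "GB")
print("ceiling:", ceiling, "GB")
print("request exceeds ceiling:", requested > ceiling)
```

Under these assumptions the request is far above the queue's ceiling, so the scheduler should not grant all of it and the job would end up with fewer executors than it asked for.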

<property>
    <name>yarn.scheduler.capacity.root.alpha.minimum-user-limit-percent</name>
    <value>10</value>
    <description>Each queue enforces a limit on the percentage of resources allocated to a user at any given time, if there is demand for resources. The user limit can vary between a minimum and maximum value. The former (the minimum value) is set to this property value and the latter (the maximum value) depends on the number of users who have submitted applications. For example, suppose the value of this property is 25. If two users have submitted applications to a queue, no single user can use more than 50% of the queue resources. If a third user submits an application, no single user can use more than 33% of the queue resources. With 4 or more users, no user can use more than 25% of the queue's resources. A value of 100 implies no user limits are imposed. The default is 100. Value is specified as an integer.</description>
</property>
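The per-user rule in the description above boils down to simple arithmetic. The function below is a simplified sketch of that description only, not the scheduler's actual implementation, which takes further settings into account. The function and parameter names are made up for illustration:

```python
def user_limit_percent(min_percent, active_users):
    """Largest share of a queue one user may hold, per the description above."""
    if active_users < 1:
        raise ValueError("active_users must be at least 1")
    # An equal split among users, but never below the configured minimum.
    return max(100.0 / active_users, float(min_percent))


# Reproduce the example from the description, where the property value is 25.
for users in (2, 3, 4, 5):
    print(users, "users ->", round(user_limit_percent(25, users), 1), "%")
```

With the value of 10 configured above, the same formula would only stop shrinking a user's share once ten or more users are active in the queue.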

Link: YARN CapacityScheduler Queue Properties

answered Oct 01 '22 15:10

mrsrinivas

Tags: apache-spark, hadoop, pyspark, hadoop-yarn
