 

What is the difference between the fair and capacity schedulers?

I am new to the world of Hadoop and want to know the difference between the fair and capacity schedulers. Also, when should each one be used? Please answer in a simple way; I have read many things on the Internet but did not get much from them.

asked Oct 24 '14 by Flowra

People also ask

What is capacity scheduler?

The CapacityScheduler is designed to allow sharing a large cluster while giving each organization a minimum capacity guarantee. The central idea is that the available resources in the Hadoop Map-Reduce cluster are partitioned among multiple organizations who collectively fund the cluster based on computing needs.

What is Fair Scheduler in Hadoop?

Fair scheduling is a method of assigning resources to applications such that all apps get, on average, an equal share of resources over time. Hadoop NextGen is capable of scheduling multiple resource types. By default, the Fair Scheduler bases scheduling fairness decisions only on memory.

What are the different schedulers available in YARN?

There are three types of schedulers available in YARN: FIFO, Capacity and Fair. FIFO (first in, first out) is the simplest to understand and does not need any configuration. It runs the applications in submission order by placing them in a queue.
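For reference, the scheduler the ResourceManager uses is selected in yarn-site.xml. A minimal sketch, using the three standard implementation class names shipped with YARN:

    <!-- yarn-site.xml: pick the scheduler implementation -->
    <property>
      <name>yarn.resourcemanager.scheduler.class</name>
      <!-- CapacityScheduler shown here; for the Fair Scheduler use
           org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
           and for FIFO use
           org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler -->
      <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
    </property>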

What are the different schedulers available in Hadoop?

There are mainly three types of schedulers in Hadoop: the FIFO (First In, First Out) Scheduler, the Capacity Scheduler, and the Fair Scheduler.


1 Answer

Fair scheduling is a method of assigning resources to jobs such that all jobs get, on average, an equal share of resources over time. When there is a single job running, that job uses the entire cluster. When other jobs are submitted, task slots that free up are assigned to the new jobs, so that each job gets roughly the same amount of CPU time. Unlike the default Hadoop scheduler, which forms a queue of jobs, this lets short jobs finish in reasonable time while not starving long jobs. It is also a reasonable way to share a cluster between a number of users. Finally, fair sharing can also work with job priorities: the priorities are used as weights to determine the fraction of total compute time that each job should get.
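To make the weight idea concrete, here is a minimal sketch of a Fair Scheduler allocation file (the queue names and numbers are invented for illustration; the elements are the standard fair-scheduler.xml format, which yarn-site.xml points at via yarn.scheduler.fair.allocation.file):

    <!-- fair-scheduler.xml: "prod" gets, on average, twice the share of "dev" -->
    <allocations>
      <queue name="prod">
        <weight>2.0</weight>
        <!-- guaranteed floor before fair sharing of the remainder -->
        <minResources>10240 mb,10 vcores</minResources>
      </queue>
      <queue name="dev">
        <weight>1.0</weight>
        <!-- policy applied among apps inside this queue (fifo, fair, or drf) -->
        <schedulingPolicy>fair</schedulingPolicy>
      </queue>
    </allocations>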

The CapacityScheduler is designed to allow sharing a large cluster while giving each organization a minimum capacity guarantee. The central idea is that the available resources in the Hadoop Map-Reduce cluster are partitioned among multiple organizations who collectively fund the cluster based on their computing needs. There is an added benefit that an organization can access any excess capacity not being used by others. This provides elasticity for the organizations in a cost-effective manner.
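As an illustration of those per-organization guarantees and of the elasticity point, a capacity-scheduler.xml sketch might look like the following (the queue names and percentages are invented; the property names are the standard CapacityScheduler ones):

    <!-- capacity-scheduler.xml: split the cluster 70/30 between two organizations -->
    <property>
      <name>yarn.scheduler.capacity.root.queues</name>
      <value>engineering,marketing</value>
    </property>
    <property>
      <!-- guaranteed minimum share for the queue, in percent -->
      <name>yarn.scheduler.capacity.root.engineering.capacity</name>
      <value>70</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.marketing.capacity</name>
      <value>30</value>
    </property>
    <property>
      <!-- elasticity: engineering may borrow idle capacity up to 100% of the cluster -->
      <name>yarn.scheduler.capacity.root.engineering.maximum-capacity</name>
      <value>100</value>
    </property>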

answered Oct 05 '22 by user3484461