I'm using the current version of Hadoop and running some TestDFSIO benchmarks (v. 1.8) to compare the case where the default file system is HDFS against the case where the default file system is an S3 bucket (accessed via S3a).
When reading 100 x 1 MB files with S3a as the default file system, the maximum number of containers I observe in the YARN Web UI is lower than with HDFS as the default, and S3a is about 4 times slower.
When reading 1000 x 10 KB files with S3a as the default file system, the maximum number of containers I observe in the YARN Web UI is at least 10 times lower than with HDFS as the default, and S3a is about 16 times slower (e.g. roughly 50 seconds of test execution time with HDFS as the default versus about 16 minutes with S3a as the default).
The number of Launched Map Tasks is as expected in each case; there is no difference there. But why is YARN creating at least 10 times fewer containers (e.g. 117 on HDFS versus 8 on S3a)? How does YARN decide how many containers to create when the cluster's vcores and RAM, as well as the job's input splits and launched map tasks, are the same, and only the storage back-end is different?
It is of course reasonable to expect some performance difference between HDFS and Amazon S3 (via S3a) when running the same TestDFSIO jobs. What I'm after is understanding how YARN decides the maximum number of containers it launches during those jobs when only the default file system is changed, because currently, when the default file system is S3a, YARN leaves roughly 90% of the parallelism unused compared to what it achieves when the default file system is HDFS.
The cluster is a 15-node cluster, with 1 NameNode, 1 ResourceManager (YARN), and 13 DataNodes (worker nodes). Each node has 128 GB of RAM and a 48-core CPU. This is a dedicated testing cluster: nothing else runs on the cluster during the TestDFSIO test runs.
For HDFS, dfs.blocksize is 256m, and it uses 4 HDDs (dfs.datanode.data.dir is set to file:///mnt/hadoopData1,file:///mnt/hadoopData2,file:///mnt/hadoopData3,file:///mnt/hadoopData4).
For S3a, fs.s3a.block.size is set to 268435456, i.e. 256m, the same as the dfs.blocksize configured above.
The Hadoop tmp directory is on an SSD (hadoop.tmp.dir is set to /mnt/ssd1/tmp in core-site.xml, and mapreduce.cluster.local.dir is set to /mnt/ssd1/mapred/local in mapred-site.xml).
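For reference, here is roughly how the relevant settings look when collected in one place. This is only a sketch that sets them programmatically via Hadoop's Configuration API (in practice they live in the *-site.xml files on the cluster nodes), and the class name is just for illustration; the values are the ones described above.

    import org.apache.hadoop.conf.Configuration;

    public class BenchmarkConf {
        public static Configuration forS3aRun() {
            Configuration conf = new Configuration();
            // Equivalent of the *-site.xml entries described above.
            conf.set("fs.defaultFS", "s3a://emre-hadoop-test-bucket");   // or hdfs://hadoop1:9000
            conf.set("dfs.blocksize", "256m");
            conf.set("dfs.datanode.data.dir",
                     "file:///mnt/hadoopData1,file:///mnt/hadoopData2,"
                     + "file:///mnt/hadoopData3,file:///mnt/hadoopData4");
            conf.setLong("fs.s3a.block.size", 268435456L);               // 256m, matches dfs.blocksize
            conf.set("hadoop.tmp.dir", "/mnt/ssd1/tmp");
            conf.set("mapreduce.cluster.local.dir", "/mnt/ssd1/mapred/local");
            return conf;
        }
    }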
The performance difference (HDFS as the default file system versus S3a as the default) is summarized below:
TestDFSIO v. 1.8 (READ)
fs.default.name                # of Files x Size of File  Launched Map Tasks  Max # of containers observed in YARN Web UI  Test exec time (sec)
=============================  =========================  ==================  ===========================================  ====================
hdfs://hadoop1:9000            100 x 1 MB                 100                 117                                          19
hdfs://hadoop1:9000            1000 x 10 KB               1000                117                                          56
s3a://emre-hadoop-test-bucket  100 x 1 MB                 100                 60                                           78
s3a://emre-hadoop-test-bucket  1000 x 10 KB               1000                8                                            1012
YARN is a generic job scheduling framework, and HDFS is a storage framework. In a nutshell, YARN has a master (the ResourceManager) and workers (the NodeManagers); the ResourceManager creates containers on the workers to execute MapReduce jobs, Spark jobs, etc.
MapReduce requests three different kinds of containers from YARN: the application master container, map containers, and reduce containers. For each container type, there is a corresponding set of properties that can be used to set the resources requested.
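As an illustration of the properties meant here (this is just a sketch with arbitrary example values, set programmatically for brevity; the class name is only for illustration):

    import org.apache.hadoop.conf.Configuration;

    public class ContainerResources {
        public static void main(String[] args) {
            // Arbitrary example values; they only illustrate which properties
            // control the resources requested for each container type.
            Configuration conf = new Configuration();
            conf.set("yarn.app.mapreduce.am.resource.mb", "2048");       // application master container
            conf.set("yarn.app.mapreduce.am.resource.cpu-vcores", "1");
            conf.set("mapreduce.map.memory.mb", "1024");                 // map containers
            conf.set("mapreduce.map.cpu.vcores", "1");
            conf.set("mapreduce.reduce.memory.mb", "2048");              // reduce containers
            conf.set("mapreduce.reduce.cpu.vcores", "1");
        }
    }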
YARN allows the data stored in HDFS (Hadoop Distributed File System) to be processed by various data processing engines, such as batch processing, stream processing, interactive processing, graph processing, and more, which increases the overall efficiency of the system.
A Container represents an allocated resource in the cluster. The ResourceManager is the sole authority that allocates Containers to applications. An allocated Container always lives on a single node, has a unique ContainerId, and has a specific amount of Resource allocated to it.
Long story short, one of the important criteria YARN uses to decide how many containers to create is data locality. When using a non-HDFS file system such as S3a (to connect to Amazon S3 or another S3-compatible object store), it is the responsibility of the file system implementation to provide the data locality information, because in that case none of the data is local to any node: every node has to retrieve the data over the network, or, put differently, every node has the same (lack of) data locality.
The previous paragraph explains the container creation behavior I observed when running Hadoop MapReduce jobs against Amazon S3 via the S3a file system. To fix the issue, I've started working on a patch; the development will be tracked via HADOOP-12878.
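You can see the difference in locality information yourself by asking each file system for the block locations of a file. The sketch below is only illustrative: the class name is made up, and the paths/bucket are examples from my setup (adjust them to files that actually exist). With HDFS you get the DataNode hostnames for each block; with S3A there is no real locality to report.

    import java.net.URI;
    import java.util.Arrays;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class PrintBlockLocations {
        public static void main(String[] args) throws Exception {
            // Example paths; point these at files that exist in your cluster/bucket.
            printLocations("hdfs://hadoop1:9000/benchmarks/TestDFSIO/io_data/test_io_0");
            printLocations("s3a://emre-hadoop-test-bucket/benchmarks/TestDFSIO/io_data/test_io_0");
        }

        static void printLocations(String uri) throws Exception {
            Configuration conf = new Configuration();
            Path path = new Path(uri);
            FileSystem fs = FileSystem.get(URI.create(uri), conf);
            FileStatus status = fs.getFileStatus(path);
            // This is the locality information the scheduler ultimately relies on.
            BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
            System.out.println(uri);
            for (BlockLocation block : blocks) {
                System.out.printf("  offset=%d length=%d hosts=%s%n",
                        block.getOffset(), block.getLength(), Arrays.toString(block.getHosts()));
            }
        }
    }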