Suppose a YARN application has long-running tasks (running for 1 hour or longer). When an MR job starts, all cluster resources are blocked, at least until one container finishes, which can sometimes take a long time.
Is there a way to limit the number of simultaneously running containers? Something along the lines of, e.g., map.vcores.max (per NodeManager, or globally), so that other applications are not blocked.
Any ideas?
ps. Hadoop 2.3.0
YARN allocates and tracks resource usage in terms of MB of memory and virtual cores per node. For example, a 5-node cluster with 12 GB of memory allocated to YARN per node has a total memory capacity of 60 GB. With a container size of 2 GB, YARN has room to allocate 30 containers.
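To make the arithmetic concrete, here is a minimal sketch in plain Java, using the figures from the example above (the comment names the standard YARN setting involved; the numbers are illustrative, not defaults):

```java
// Back-of-the-envelope container capacity for the 5-node example above.
// Per-node memory handed to YARN is controlled by yarn.nodemanager.resource.memory-mb;
// the container size stands in for the per-container memory request.
public class ContainerCapacity {
    public static void main(String[] args) {
        int nodes = 5;
        long memoryPerNodeMb = 12 * 1024;  // 12 GB per NodeManager
        long containerSizeMb = 2 * 1024;   // 2 GB per container

        long clusterMemoryMb = nodes * memoryPerNodeMb;          // 61440 MB (~60 GB)
        long maxContainers = clusterMemoryMb / containerSizeMb;  // 30 containers

        System.out.printf("Cluster memory: %d MB -> room for %d containers of %d MB%n",
                clusterMemoryMb, maxContainers, containerSizeMb);
    }
}
```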
A YARN container is an isolated process space in which a given task runs, using resources from the cluster's resource pool. The ResourceManager has the authority to assign any container to an application. An assigned container has a unique ContainerId and always runs on a single node.
Container represents an allocated resource in the cluster. The ResourceManager is the sole authority to allocate any Container to applications. The allocated Container is always on a single node and has a unique ContainerId. It has a specific amount of Resource allocated.
This behaviour/feature can be handled at the framework level rather than in YARN itself.
In MapReduce, mapreduce.job.running.map.limit and mapreduce.job.running.reduce.limit can be used to cap the number of simultaneously running map and reduce tasks (and therefore containers).
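For example (a minimal sketch, not a complete job; the limit values here are arbitrary):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ThrottledJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Cap how many map/reduce tasks of this job may run at the same time,
        // no matter how many containers the cluster could otherwise hand out.
        conf.setInt("mapreduce.job.running.map.limit", 10);
        conf.setInt("mapreduce.job.running.reduce.limit", 2);

        Job job = Job.getInstance(conf, "throttled-job");
        // ... set mapper, reducer, input/output paths as usual ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The same properties can also be passed on the command line with -D. Note that these limits were introduced by MAPREDUCE-5583, so they need a Hadoop release that includes that change, which the 2.3.0 mentioned in the question most likely predates.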
In Tez, it can be handled with the property tez.am.vertex.max-task-concurrency.
Related JIRAs:
https://issues.apache.org/jira/browse/MAPREDUCE-5583
https://issues.apache.org/jira/browse/TEZ-2914
As far as I can see, you cannot directly limit the number of containers; it is determined only by the available resources. So the best you can do is limit the resources per application.
According to the Fair Scheduler documentation, you can assign your application to a dedicated queue. That gives you a configuration pretty close to what you are after, since you can limit the memory or vcore resources per queue.
You could also switch to a different scheduler, or even implement a custom one, but I would not recommend it: you would be stepping outside a well-tested environment, and I doubt the problem calls for that much custom work.
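A minimal sketch of the job-side half, assuming a queue named "slow" (a made-up name) has already been defined with a maxResources cap in the Fair Scheduler allocation file:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class QueuedJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Route this job to the capped queue; the actual memory/vcore limits
        // live in the Fair Scheduler allocation file, not here.
        conf.set("mapreduce.job.queuename", "slow");

        Job job = Job.getInstance(conf, "long-running-job");
        // ... job setup as usual ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```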