I have heard the term "AM limit" a couple of times in the context of running jobs in a YARN Big Data cluster.
It's also mentioned here: https://issues.apache.org/jira/browse/YARN-6428
What does it mean?
It's a setting that guarantees you don't livelock your cluster. A MapReduce job has an ApplicationMaster (AM), which spawns the mappers and reducers. If your queue is filled entirely with AMs, there is no room left to run any mappers or reducers, so none of your AMs can complete and no meaningful work gets done. You're in a livelock scenario.
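To make the arithmetic concrete, here is a rough Python sketch of a toy model. It is not YARN's actual scheduling algorithm; the function name, the container sizes, and the greedy "admit AMs first" behaviour are all assumptions made purely for illustration.

```python
def schedule(queue_mb, am_mb, task_mb, pending_apps, max_am_share=1.0):
    """Toy model (hypothetical, not YARN code): admit AMs first, up to the
    allowed AM share of the queue, then count how many task containers
    fit in whatever memory is left."""
    am_budget_mb = queue_mb * max_am_share
    running_ams = min(pending_apps, int(am_budget_mb // am_mb))
    free_mb = queue_mb - running_ams * am_mb
    task_slots = int(free_mb // task_mb)
    return running_ams, task_slots


# 16 GB queue, 2 GB AMs, 2 GB mappers/reducers, 20 jobs submitted at once.
# Without an AM limit the queue fills with AMs and no task can start.
print(schedule(16384, 2048, 2048, pending_apps=20))                     # (8, 0)
# With AMs capped at 25% of the queue, two jobs run and six task slots remain.
print(schedule(16384, 2048, 2048, pending_apps=20, max_am_share=0.25))  # (2, 6)
```

In the first call every container is an AM waiting for task containers that can never be granted, which is the stuck state described above.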
Both Capacity Scheduler and Fair Scheduler have a way to limit the share of a queue's resources that can be held by AMs. In Capacity Scheduler, look for yarn.scheduler.capacity.maximum-am-resource-percent. In Fair Scheduler, look for maxAMShare.
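As a sketch of where these settings live: the Capacity Scheduler property is set in capacity-scheduler.xml, and maxAMShare is a per-queue element in the Fair Scheduler allocation file. The small Python snippet below embeds two example fragments and reads the values back. The queue name "analytics", the values 0.2 and 0.3, and the helper function names are assumptions for illustration only, not recommendations.

```python
import xml.etree.ElementTree as ET

# Example capacity-scheduler.xml fragment (value is illustrative).
CAPACITY_SCHEDULER_XML = """
<configuration>
  <property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>0.2</value>
  </property>
</configuration>
"""

# Example Fair Scheduler allocation file fragment (hypothetical queue name).
FAIR_SCHEDULER_XML = """
<allocations>
  <queue name="analytics">
    <maxAMShare>0.3</maxAMShare>
  </queue>
</allocations>
"""

AM_PERCENT_KEY = "yarn.scheduler.capacity.maximum-am-resource-percent"


def capacity_am_percent(xml_text):
    """Return the Capacity Scheduler AM resource percent, or None if unset."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == AM_PERCENT_KEY:
            return float(prop.findtext("value"))
    return None


def fair_am_share(xml_text, queue):
    """Return maxAMShare for a top-level queue, or None if unset or absent."""
    root = ET.fromstring(xml_text)
    for q in root.findall("queue"):
        if q.get("name") == queue:
            value = q.findtext("maxAMShare")
            return float(value) if value is not None else None
    return None


print(capacity_am_percent(CAPACITY_SCHEDULER_XML))
print(fair_am_share(FAIR_SCHEDULER_XML, "analytics"))
```

When neither setting is present, the schedulers fall back to their built-in defaults, so a None here means "not overridden" rather than "no limit".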