I have an Hadoop 2.2 cluster deployed on a small number of powerful machines. I have a constraint to use YARN as the framework, which I am not very familiar with.
Thanks in advance for helping me melt these machines :)
1.
In MR1, the mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum properties dictated how many map and reduce slots each TaskTracker had.
These properties no longer exist in YARN. Instead, YARN uses yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores, which control the amount of memory and CPU on each node, both available to both maps and reduces
Essentially:
YARN has no TaskTrackers, but just generic NodeManagers. Hence, there's no more Map slots and Reduce slots separation. Everything depends on the amount of memory in use/demanded
2.
Using the web UI you can get lot of monitoring/admin kind of info:
NameNode - http://:50070/
Resource Manager - http://:8088/
In addition Apache Ambari is meant for this: http://ambari.apache.org/
And Hue for interfacing with the Hadoop/YARN cluster in many ways: http://gethue.com/
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With