I am using Hadoop 2.2.0. I can start the history server on both the master node and the slave nodes.
But I am not sure whether I need to start a history server on the slave nodes.
If I start a single history server on the master, can I get the logs of all jobs?
If I do need to start servers on both the master and the slave nodes, is there a single command that starts them all, rather than starting each server one by one?
Any comments are welcome.
The JobTracker or ResourceManager keeps all job information in memory. It drops finished jobs to avoid running out of memory. Tracking of these past jobs is delegated to the JobHistory server.
You need only one history server. It can run on any node you like, including a dedicated node of its own, but it traditionally runs on the same node as the ResourceManager. That single history server is declared in mapred-site.xml:
mapreduce.jobhistory.address
: MapReduce JobHistory Server host:port. Default port is 10020.
mapreduce.jobhistory.webapp.address
: MapReduce JobHistory Server Web UI host:port. Default port is 19888.
mapreduce.jobhistory.intermediate-done-dir
: Directory where history files are written by MapReduce jobs (in HDFS). Default is /mr-history/tmp
mapreduce.jobhistory.done-dir
: Directory where history files are managed by the MR JobHistory Server (in HDFS). Default is /mr-history/done
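As a rough illustration, the four properties above could be declared in mapred-site.xml along these lines. This is only a sketch: the host name master.example.com is a hypothetical placeholder for whichever node runs your history server, and the ports and directories are simply the defaults listed above.

```xml
<!-- Sketch of a mapred-site.xml fragment; master.example.com is a hypothetical host name -->
<configuration>
  <property>
    <!-- RPC address of the single JobHistory server -->
    <name>mapreduce.jobhistory.address</name>
    <value>master.example.com:10020</value>
  </property>
  <property>
    <!-- Web UI address of the JobHistory server -->
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master.example.com:19888</value>
  </property>
  <property>
    <!-- HDFS directory where MapReduce jobs write their history files -->
    <name>mapreduce.jobhistory.intermediate-done-dir</name>
    <value>/mr-history/tmp</value>
  </property>
  <property>
    <!-- HDFS directory where the JobHistory server manages history files -->
    <name>mapreduce.jobhistory.done-dir</name>
    <value>/mr-history/done</value>
  </property>
</configuration>
```

The same file would normally be distributed to every node, so that jobs running anywhere in the cluster know where to write history and where the one history server lives.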
You can access the history via the history server REST API; you do not access the internal history files directly. For casual browsing, the history is available in the ResourceManager web UI.
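To make the REST access a bit more concrete, here is a minimal Python sketch that lists finished jobs. It assumes the history server web UI is reachable on the default port 19888 and that the jobs endpoint is /ws/v1/history/mapreduce/jobs, returning JSON shaped like {"jobs": {"job": [...]}}. The function names are my own, and master.example.com in the usage note is a hypothetical host.

```python
import json
from urllib.request import Request, urlopen


def history_jobs_url(host, port=19888):
    """Build the URL of the history server's job list (assumed endpoint path)."""
    return f"http://{host}:{port}/ws/v1/history/mapreduce/jobs"


def parse_jobs(payload):
    """Extract (id, name, state) tuples from a JSON job-list response.

    Returns an empty list when the response contains no jobs.
    """
    data = json.loads(payload)
    jobs = (data.get("jobs") or {}).get("job") or []
    return [(job.get("id"), job.get("name"), job.get("state")) for job in jobs]


def fetch_jobs(host, port=19888, timeout=10):
    """Query the history server over HTTP and return the parsed job list."""
    request = Request(history_jobs_url(host, port),
                      headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return parse_jobs(response.read().decode("utf-8"))
```

Calling fetch_jobs("master.example.com") against a running history server should give one tuple per finished job, regardless of which nodes the job's tasks ran on.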