What's the right way to use the history server in Hadoop 2.2?

https://stackoverflow.com/questions/21843276

13-10-2022

Question

I am using Hadoop 2.2.0. I can start the historyserver on both the master node and the slave nodes.

But I am not sure: do I need to start the history server on the slave nodes?
If I start one history server on the master, can I get the logs of all jobs?
If I need to start the servers on both the master and the slave nodes, is there a single command that starts all of them, instead of starting each server one by one?

Any comments are welcome.

Solution

You need only one historyserver. It can run on any node you like, including a dedicated node of its own, but it traditionally runs on the same node as the resourcemanager. The single history server is declared in mapred-site.xml with these properties:

mapreduce.jobhistory.address: MapReduce JobHistory Server host:port. Default port is 10020.
mapreduce.jobhistory.webapp.address: MapReduce JobHistory Server Web UI host:port. Default port is 19888.
mapreduce.jobhistory.intermediate-done-dir: Directory where history files are written by MapReduce jobs (in HDFS). Default is /mr-history/tmp.
mapreduce.jobhistory.done-dir: Directory where history files are managed by the MR JobHistory Server (in HDFS). Default is /mr-history/done.
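
To make the layout concrete, here is a minimal sketch that renders those four properties as a Hadoop-style <configuration> block, which you could paste into mapred-site.xml. The host name master.example.com is a made-up placeholder, and the helper build_mapred_site is only an illustration, not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Hypothetical host name: replace it with the node that runs your history server.
HISTORY_HOST = "master.example.com"

# Property names, ports and directories as listed above; only the host is assumed.
PROPERTIES = {
    "mapreduce.jobhistory.address": f"{HISTORY_HOST}:10020",
    "mapreduce.jobhistory.webapp.address": f"{HISTORY_HOST}:19888",
    "mapreduce.jobhistory.intermediate-done-dir": "/mr-history/tmp",
    "mapreduce.jobhistory.done-dir": "/mr-history/done",
}


def build_mapred_site(properties):
    """Return a <configuration> element, as text, with one <property> per entry."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # ET.indent exists only on Python 3.9+; older versions emit a single line.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


print(build_mapred_site(PROPERTIES))
```

The same file would typically be distributed to every node, so that clients and the resourcemanager all point at the one history server.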

You access the history through the historyserver REST API; you do not read the internal history files directly. For casual browsing, the history is available in the resourcemanager web UI.
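
As a rough sketch of the REST route, the snippet below lists finished jobs through the history server's /ws/v1/history/mapreduce/jobs resource. The address master.example.com:19888 is a placeholder for your own mapreduce.jobhistory.webapp.address value, and the function names are illustrative only.

```python
import json
from urllib.request import Request, urlopen

# Hypothetical address: use your mapreduce.jobhistory.webapp.address value.
HISTORY_WEBAPP = "master.example.com:19888"


def jobs_url(webapp_address):
    """Build the URL of the history server's job-list resource."""
    return f"http://{webapp_address}/ws/v1/history/mapreduce/jobs"


def parse_jobs(payload):
    """Extract (id, name, state) tuples from the JSON job list.

    The list sits under "jobs" -> "job"; a missing or null entry is
    treated defensively as an empty list.
    """
    doc = json.loads(payload)
    jobs = (doc.get("jobs") or {}).get("job") or []
    return [(job.get("id"), job.get("name"), job.get("state")) for job in jobs]


def fetch_jobs(webapp_address, timeout=10):
    """Query the history server over HTTP and return the parsed job list."""
    request = Request(jobs_url(webapp_address), headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return parse_jobs(response.read().decode("utf-8"))
```

Calling fetch_jobs(HISTORY_WEBAPP) against a running history server should return one tuple per finished job. Nothing in the sketch touches the HDFS history directories, which stay internal to the server.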

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow