What is the default value for mapred.tasktracker.tasks.maximum in the Hadoop configuration?

StackOverflow https://stackoverflow.com/questions/21249952

  •  30-09-2022

Question

I found this configuration name at http://wiki.apache.org/hadoop/HowManyMapsAndReduces

However, when I searched the Hadoop documentation, I found the configuration names listed as:

 mapred.tasktracker.reduce.tasks.maximum   default value 2
 mapred.tasktracker.map.tasks.maximum      default value 2

These are listed at http://hadoop.apache.org/docs/r1.1.1/mapred-default.html, but I cannot find mapred.tasktracker.tasks.maximum there. Am I missing something obvious?


Solution

The first link explains how many mappers (just an indication) and reducers you should set for your MapReduce job, so that you can achieve better load balancing.

The second thing you mention is how many map tasks and reduce tasks can run at the same time on each node. In http://hadoop.apache.org/docs/r1.1.1/mapred-default.html these settings appear as:

mapred.tasktracker.map.tasks.maximum         2  
mapred.tasktracker.reduce.tasks.maximum      2

If you want to change them, edit the file ${HADOOP_HOME}/conf/mapred-site.xml, where ${HADOOP_HOME} is the path of your Hadoop installation.
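As a minimal sketch of what such an override might look like in mapred-site.xml: the two property names are the ones listed above, while the value 4 is only an illustrative assumption and not a recommendation.

```xml
<configuration>
  <!-- Maximum number of map tasks run simultaneously on this node (value 4 is hypothetical) -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <!-- Maximum number of reduce tasks run simultaneously on this node (value 4 is hypothetical) -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
```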

For example, suppose you want 8 reducers (set with conf.setNumReduceTasks(8); in your code), you keep these default values, and your cluster has 2 nodes. Each node runs at most 2 map tasks at a time, so at most 2 x 2 = 4 map tasks run concurrently across the cluster. When one of these map tasks finishes, its node starts the next map task in the queue. The same per-node limit applies to reduce tasks, so at most 2 x 2 = 4 of the 8 reducers run at the same time.
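The arithmetic in this example can be sketched in plain Java. The class SlotMath and its methods are hypothetical helpers for illustration, not part of the Hadoop API, and the "waves" figure is an idealization that assumes all tasks in a round take about the same time.

```java
// Hypothetical helper, not part of Hadoop: plain arithmetic on per-node slot limits.
class SlotMath {
    // Maximum number of tasks of one kind (map or reduce) running at once in the cluster.
    static int maxConcurrentTasks(int nodes, int slotsPerNode) {
        return nodes * slotsPerNode;
    }

    // Idealized number of rounds needed to run totalTasks when at most
    // `concurrent` of them can run at the same time (ceiling division).
    static int waves(int totalTasks, int concurrent) {
        return (totalTasks + concurrent - 1) / concurrent;
    }

    public static void main(String[] args) {
        int nodes = 2;        // cluster size from the example
        int slotsPerNode = 2; // default of mapred.tasktracker.map/reduce.tasks.maximum
        int reducers = 8;     // value passed to conf.setNumReduceTasks(8)

        int concurrent = maxConcurrentTasks(nodes, slotsPerNode);
        System.out.println("Concurrent tasks per kind: " + concurrent);
        System.out.println("Idealized waves for the reducers: " + waves(reducers, concurrent));
    }
}
```

With the defaults this gives 4 concurrent tasks of each kind, so 8 reducers would need roughly two rounds.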

EDIT: I found the mistake. In the first link it says:

The right number of reduces seems to be 0.95 or 1.75 * (nodes * mapred.tasktracker.tasks.maximum).

It should be:

The right number of reduces seems to be 0.95 or 1.75 * (nodes * mapred.tasktracker.reduce.tasks.maximum).
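A rough sketch of the corrected rule in plain Java follows. The class ReduceCountHeuristic is a hypothetical helper, and rounding down to a whole number is an assumption made here, since the quoted rule does not say how to round.

```java
// Hypothetical helper, not part of Hadoop: evaluates the rule quoted above.
class ReduceCountHeuristic {
    // factor is 0.95 or 1.75; reduceSlotsPerNode stands for
    // mapred.tasktracker.reduce.tasks.maximum.
    // Rounding down is an assumption; the quoted rule does not specify it.
    static int suggestedReduces(double factor, int nodes, int reduceSlotsPerNode) {
        int totalReduceSlots = nodes * reduceSlotsPerNode;
        return (int) Math.floor(factor * totalReduceSlots);
    }

    public static void main(String[] args) {
        int nodes = 2;              // example cluster size used in this answer
        int reduceSlotsPerNode = 2; // default value from mapred-default.html

        System.out.println("0.95 factor: " + suggestedReduces(0.95, nodes, reduceSlotsPerNode));
        System.out.println("1.75 factor: " + suggestedReduces(1.75, nodes, reduceSlotsPerNode));
    }
}
```

For 2 nodes with the default of 2 reduce slots each, this suggests 3 reduces with the 0.95 factor and 7 with the 1.75 factor.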

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow