Question

What is the difference between a mapper and a map task? Similarly, between a reducer and a reduce task? Also, how are the numbers of mappers, map tasks, reducers, and reduce tasks determined during the execution of a MapReduce job? Describe the relationships between them, if there are any.

Solution

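Put simply, Mapper and Reducer are classes you write for a MapReduce job; their map() and reduce() methods hold the processing logic. A map task is a running unit of work that creates an instance of the Mapper and calls map() once for each record in its input split. Likewise, a reduce task runs an instance of the Reducer and calls reduce() once per key. So the number of mappers is the number of map tasks, and the number of reducers is the number of reduce tasks.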
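To make the class-versus-task distinction concrete, here is a toy sketch in plain Java. It does not use the Hadoop API: ToyMapper and ToyMapTask are hypothetical names invented for illustration, and the "split" is just a list of strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a user-written Mapper class (NOT the Hadoop API).
final class ToyMapper {
    static int instancesCreated = 0;

    ToyMapper() {
        instancesCreated++;
    }

    // Called once per input record; emits the upper-cased record.
    void map(String record, List<String> output) {
        output.add(record.toUpperCase());
    }
}

// Hypothetical stand-in for a map task: one task processes one input split.
final class ToyMapTask {
    static List<String> run(List<String> split) {
        ToyMapper mapper = new ToyMapper(); // each task creates its own Mapper instance
        List<String> output = new ArrayList<>();
        for (String record : split) {
            mapper.map(record, output);     // map() runs once per record in the split
        }
        return output;
    }
}
```

Running ToyMapTask.run on two separate splits creates two ToyMapper instances, one per task, which mirrors the one-Mapper-instance-per-map-task relationship described above.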

When a MapReduce job runs, the number of map tasks spawned equals the number of input splits, which the InputFormat computes from the input. For file-based input the split size defaults to the HDFS block size, so there is usually one map task per block. The number of reduce tasks, by contrast, is specified in the driver code: either set the property mapred.reduce.tasks (named mapreduce.job.reduces in newer Hadoop releases) in the job configuration object, or call org.apache.hadoop.mapreduce.Job#setNumReduceTasks(int).
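As a rough illustration of how the split count (and hence the map task count) falls out of the file size, the following plain-Java sketch mirrors the arithmetic I recall from FileInputFormat in the new API: the split size is max(minSize, min(maxSize, blockSize)), and a trailing remainder of up to 10% of a split is folded into the last split. The class name SplitCountSketch is hypothetical, and the sketch ignores empty files, non-splittable formats, and multi-file input, so check it against the Hadoop version you actually run.

```java
// Simplified sketch of per-file split counting; not the Hadoop implementation itself.
final class SplitCountSketch {
    // Slack factor: the last split may be up to 10% larger than splitSize.
    static final double SPLIT_SLOP = 1.1;

    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Number of splits (and therefore map tasks) for one non-empty, splittable file.
    static int countSplits(long fileLength, long splitSize) {
        int splits = 0;
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits++; // the final, possibly short or slightly oversized, split
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long splitSize = computeSplitSize(128 * mb, 1, Long.MAX_VALUE);
        System.out.println(countSplits(300 * mb, splitSize)); // prints 3
        System.out.println(countSplits(130 * mb, splitSize)); // prints 1
    }
}
```

With a 128 MB block size, a 300 MB file yields three splits, while a 130 MB file yields only one because the extra 2 MB falls within the slack.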

The old JobConf API had a setNumMapTasks() method, though it was only a hint to the framework. The new API, org.apache.hadoop.mapreduce.Job, has no such method, the intention being that the number of mappers is calculated from the input splits.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow