Question

While configuring a MapReduce job, I know that one can set the number of reduce tasks with job.setNumReduceTasks(2).

Can we set the number of map tasks?

I don't see any methods to do this.

If there is no such functionality, does someone know why this framework has the ability to have more than 1 reduce task, but not more than 1 map task?


Solution

There used to be a method for setting the number of map tasks, setNumMapTasks. But it was merely a hint to the framework and could not guarantee that you would get exactly the specified number of maps. The number of map tasks is actually governed by the InputFormat you are using in your job, which is why the setting is no longer supported.

If you are not happy with the number of mappers created by the framework, you could try tweaking the values of the following two properties to suit your requirements:

- mapred.min.split.size
- mapred.max.split.size
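
To see why these two properties change the mapper count, here is a minimal sketch of the rule that, as far as I understand it, the file-based InputFormats in the newer API use to pick a split size: clamp the block size between the minimum and the maximum. The class name SplitSizeSketch and the sample values are made up for illustration; this is plain Java, not Hadoop's own code.

```java
// Illustrative sketch only: SplitSizeSketch is a hypothetical class, not part of Hadoop.
// It assumes the split size is chosen as max(minSize, min(maxSize, blockSize)).
final class SplitSizeSketch {

    // Clamp the HDFS block size between the configured minimum and maximum split size.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 64 * mb; // assumed HDFS block size

        // No overrides: the split size equals the block size.
        System.out.println(computeSplitSize(blockSize, 1L, Long.MAX_VALUE) / mb); // prints 64

        // Raising the minimum above the block size gives bigger splits, so fewer mappers.
        System.out.println(computeSplitSize(blockSize, 128 * mb, Long.MAX_VALUE) / mb); // prints 128

        // Lowering the maximum below the block size gives smaller splits, so more mappers.
        System.out.println(computeSplitSize(blockSize, 1L, 32 * mb) / mb); // prints 32
    }
}
```

So raising mapred.min.split.size tends to reduce the number of mappers, and lowering mapred.max.split.size tends to increase it.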

OTHER TIPS

The number of map tasks is not something the programmer sets directly. Rather, the Hadoop framework creates as many mappers as there are input splits of the input file in HDFS (a split is generally 64 MB, the default block size, but this can be changed), and the TaskTrackers run those mappers.
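
As a rough illustration of "one mapper per input split", the sketch below estimates how many splits a single splittable, non-empty file would produce for a given split size. The class SplitCountSketch is hypothetical, and the 1.1 slack factor (the last split may be up to 10% larger than the split size before another split is started) is my recollection of how FileInputFormat behaves, so treat it as an assumption.

```java
// Illustrative sketch only: SplitCountSketch is a hypothetical class, not part of Hadoop.
// Assumes one splittable, non-empty file and a 1.1 slack factor on the final split.
final class SplitCountSketch {

    static final double SPLIT_SLOP = 1.1; // assumed slack allowed on the last split

    // Estimate the number of input splits (and so map tasks) for one file.
    static long countSplits(long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        long splits = 0;
        long remaining = fileLength;
        // Carve off full-size splits while clearly more than one split's worth remains.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        // Whatever is left becomes the final split.
        if (remaining > 0) {
            splits++;
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println(countSplits(200 * mb, 64 * mb)); // prints 4
        System.out.println(countSplits(70 * mb, 64 * mb));  // prints 1 (within the slack)
        System.out.println(countSplits(128 * mb, 64 * mb)); // prints 2
    }
}
```

For a job with several input files, the estimate would be the sum over the files, since each file is split separately.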

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow